Take Me Back

From SIS Wiki


Ture Farwell - Take Me Back - A look at the Wayback Machine

Definition of Project: This project aims to understand what the Internet Archive is and how various individuals use it. It examines the reliability of the Internet Archive and the legal issues that can arise while using the service. Additional articles that expand on these topics beyond the initial scope are also referenced.


Annotations

Alnoamany, Y., Alsum, A., Weigle, M. C., & Nelson, M. L. (2014). Who and What Links to the Internet Archive. Research and Advanced Technology for Digital Libraries Lecture Notes in Computer Science, 346–357. https://doi.org/10.1007/978-3-642-40501-3_35

The article was written by Yasmin AlNoamany, Ahmed AlSum, Michele C. Weigle, and Michael L. Nelson. The study's purpose was to analyze what Wayback Machine users are looking for, where they come from, how many of their pages link to the Wayback Machine, and why they use it. The study found that most Wayback Machine users are English speakers, followed by Europeans. It also found that most users came because they could not find the website they were looking for on the live web, and that over half of the requested pages no longer exist there. Because of how the software is set up, some information, such as users' location data, could not be acquired.


Andersen, H. (2013). Website owner's practice guide to the Wayback Machine. Journal on Telecommunications & High Technology Law, 11(1), 251-278.

Holly Andersen wrote the article in 2013 for the Journal on Telecommunications & High Technology Law. The article looks at legal issues that can arise when using the Wayback Machine and covers screenshots as evidence, litigation, and access. At 29 pages, there is too much to summarize quickly, but it covers issues that website owners are likely to consider important. Among the individual topics are ways to stop a crawler from recording a page and an overview of what the Wayback Machine is and how it works. More importantly, the article offers examples of court cases and laws that deal with the Internet Archive.


Arora, S. K., Li, Y., Youtie, J., & Shapira, P. (2015). Using the Wayback Machine to mine websites in the social sciences: A methodological resource. Journal of the Association for Information Science and Technology, 67(8), 1904–1915. https://doi.org/10.1002/asi.23503

The article was written by Sanjay K. Arora, Yin Li, Jan Youtie, and Phillip Shapira in 2015 for the Journal of the Association for Information Science and Technology. The Internet Archive is an online repository with the goal of archiving the World Wide Web. As a service, it allows individuals to track information and trends over time, with the earliest websites dating back to 1996. For the social sciences, the Internet Archive is a valuable resource. The article looks at the steps individuals or research teams can go through to successfully use the Internet Archive as a data source for research. These steps include sampling, organizing and defining the web crawl's boundaries, crawling, website variable operationalization, integration with other data sources, and analysis (Arora et al., 2015). The outcome depends on the quality of the research and the effort put into it by those involved.


Berčič, B. (2005). Protection of personal data and copyrighted material on the web: The cases of Google and Internet Archive. Information & Communication Technology Law. https://doi.org/10.1080/1360083042000325283

Boštjan Berčič wrote the article in 2005 for Information & Communication Technology Law. It is an older article that looks at personal data and copyright on the web. The first part focuses on search engines such as Google, while the second part looks at internet archives. The author concludes that internet archives should be approached in the same way as search engines, since both can deal with intellectual property and personal information. Fifteen years later, with the rise of social media and the advent of the smartphone, these same issues are more prevalent than ever.


Eltgroth, D. R. (2009). Best evidence and the Wayback Machine: Toward a workable authentication standard for archived internet evidence. Fordham Law Review.

Deborah R. Eltgroth wrote the article in 2009 for the Fordham Law Review. The paper addresses the issues faced when material recorded by the Wayback Machine is used in a court case. As the internet is always in a state of flux, a set standard would allow reliable evidence to be presented. To help facilitate this, the author looks at using "Rule 901" to accommodate proof brought forward in a court case. The paper does this by going over case law, by looking at different ways to approach the issue, and finally by looking at possible alternatives. If implemented correctly, such a standard should allow archived evidence to be useful and to play a significant role in a trial.


Sampath Kumar, B. T., & Prithviraj, K. R. (2015). Bringing life to dead: Role of Wayback Machine in retrieving vanished URLs. Journal of Information Science, 41(1), 71–81. https://doi.org/10.1177/0165551514552752

The article was written by B.T. Sampath Kumar and K.R. Prithviraj in 2015 for the 41st volume of the Journal of Information Science. The article reports on a study that encompasses 5698 URLs in 1700 articles from an Indian conference proceeding published between 2001 and 2010 (Kumar and Prithviraj, 2015). It covers the existing literature, the study's objectives, and the methodology. In the end, the study showed an increase in the use of URL citations. It also found that older citations disappeared at a higher rate than newer ones. The Wayback Machine was used throughout the study to examine the citations as they existed at one point or another.


Kumar, B. T. S., Kumar, D. V., & Prithviraj, K. (2015). Wayback machine: reincarnation to vanished online citations. Program, 49(2), 205–223. https://doi.org/10.1108/prog-07-2013-0039

The article was written by B.T. Sampath Kumar, D. Vinay Kumar, and K.R. Prithviraj and published in 2015 in Program, an Emerald publication. The study sought to discover the rate at which online sources decay while also looking to recover these same sources by using the Wayback Machine. The study selected three publications from Emerald. The 389 published articles contained a total of 15,211 citations, of which only 1,930 were online citations. These citations were then run through a W3C checker to test the links' viability. The results showed a degree of variance across the five-year period examined: only 48.33% of articles had been archived, while 51.67% had not.


Lueck, T. (2014). Internet Archive: Digital library of free books, movies, music, and Wayback Machine/The Internet Archive companion. American Journalism. https://doi.org/10.1080/08821127.2014.905381

Terry Lueck wrote the article in 2014 for an issue of American Journalism. While less in-depth than some pieces, it does a decent job of introducing the reader to what the Internet Archive offers the individuals who use its service. Some of the services offered are web, video, live music, software, and texts regarding different media forms (Lueck, 2014). It can also act as an "open library," letting users borrow books or other forms of media as they would from a more formal library. Additionally, other services allow users to watch old television programs, films, and sporting events. The article gives a short history of the Internet Archive and an overview of its selection and acquisition process. Overall, it offers an excellent introduction to the services provided by the Internet Archive.


Milligan, I. (2016). Lost in the infinite archive: The promise and pitfalls of web archives. International Journal of Humanities and Arts Computing. https://doi.org/10.3366/ijhac.2016.0161

Ian Milligan wrote the article in 2016 for the International Journal of Humanities and Arts Computing. The paper attempts to assess internet archiving and where it needs to improve by looking at three case studies: the Wide Web Scrape, political movements, and GeoCities. The Wide Web Scrape is 2,713,676,341 files stored across 85,570 WebARChive (WARC) files (Milligan, 2016). As the author is from Canada, he takes the time to look at the websites associated with the various Canadian political parties. The last case study looks at GeoCities, which was a popular web hosting platform. Yahoo acquired it in 1999, and at the time, it was the third most visited website on the World Wide Web (Milligan, 2016). In all three cases, the author looks at different issues, such as digital rights and copyright. The author's final point is that these websites were not created with an internet archive in mind, and that regardless of what happens, steps should be taken now to prepare for working with them.


Oury, C., & Poll, R. (2013). Counting the uncountable: statistics for web archives. Performance Measurement and Metrics, 14(2), 132-141. https://doi.org/10.1108/PMM-05-2013-0014

The article was written by Clement Oury and Roswitha Poll in 2013 for Performance Measurement and Metrics. The purpose of the paper is to describe ISO Report ISO/TR 14873 (Oury & Poll, 2013). Internet archiving has been around since the early 1990s. Since that time, various steps have been taken to help standardize the process, such as ISO 2789: international library statistics. Some of the standard practices brought up by the article are "The library has a clear statement of the intended coverage of the collection and succeeds in following it" (Oury & Poll, 2013, p. 8) and "The library has implemented long-term preservation procedures" (Oury & Poll, 2013, p. 8). While only briefly mentioned in the article, having standardized practices helps in dealing with copyright law or legal deposit.