NewspaperPreservation

From SIS Wiki

Newspaper Preservation

Annotations by Laura Gentry, Greg McElhatton, Elizabeth Nicholson, and Courtney Whitmore


Abel, R. (2013). The pleasures and perils of big data in digitized newspapers. Film History, 25(1/2), 1-10. doi:10.2979/filmhistory.25.1-2.1

Abel (2013) uses a case study of twentieth-century newspapers’ coverage of cinema to illustrate the benefits and perils of using digitized newspapers. He points out that researchers are no longer limited by geography in accessing material and can often reach it from home over the Internet. However, researchers lose the community of librarians, archivists, and other researchers who might know more about a particular source or could point them to other relevant material during a visit to a library or archive. According to Abel (2013), digitization creates more awareness of materials, and in turn, researchers now face larger amounts of data to sift through and draw conclusions from. At the same time, it becomes increasingly difficult to generalize because it is much harder to determine how representative a sample drawn from a newspaper database is. The geographical and temporal gaps in coverage are somewhat hidden behind a search-box interface. Abel (2013) cites the example of GenealogyBank (Newsbank), which, in his search for 1910s female cinema icons, lacked newspapers from Pittsburgh, Buffalo, St. Louis, San Francisco, and Peoria, Illinois, as well as twentieth-century coverage of the Chicago Herald, Milwaukee Sentinel, Boston Post, and Birmingham Age-Herald. As Abel (2013) points out, the power of newspaper databases has helped recover lost and nearly forgotten female cinema critics, like Daisy Dean or Syracuse’s “The Film Girl”. Weighing the benefits and perils of digitized newspapers is a balancing act. Abel (2013) concludes that digitization makes it easier to mine data but more difficult to frame and interpret the data, which is the crux of writing history of any kind.


Allen, R. B., & Johnson, K. A. (2008). Preserving digital local news. Electronic Library, 26(3), 387-399. doi:10.1108/02640470810879527

Allen and Johnson (2008) examine the historic efforts at saving the local news from both an analog and a digital perspective and the news’ importance to cultural heritage. They analyze the cost of saving digital local news by looking at the storage space required and the cost to maintain that storage over a period of time. Allen and Johnson (2008) also developed a priority and rating system in order to determine what needed to be preserved based on a statistical formula. Indexing and browsing are other factors to consider along with the saving of duplicates. The authors examine different business models to fund preservation by looking at the following entities: government, commercial providers (news databases and news search engines), and private foundations.


Arvidson, A., & Grenholm, O. (2011). Harvesting of online newspapers at the National Library of Sweden. In H. Walravens (Ed.), IFLA publications series, annual membership subscription: Newspapers: Legal deposit and research in the digital era (pp. 31-35). Walter de Gruyter. Retrieved from http://site.ebrary.com.proxy.lib.wayne.edu/lib/wayne/reader.action?ppg=17&docID=10486435&tm=1415064046205

The article begins by describing the origins of web harvesting at the National Library of Sweden and then explains the process with regard to newspapers. It details the difficulties of capture failures, specifically citing issues with coding and software, and explains how users may access the harvested newspaper materials. Lastly, it discusses the breakdown of paper formats (online and print versus online only) and future considerations, which include issues of frequency and non-traditional versions of a “daily paper”, such as blog content.


Bailey-Hainer, B., & Sutton, S. (2007). All the news that's fit to digitize: Creating Colorado's historic newspaper collection. Serials Librarian, 52(1-2), 67-78. doi:10.1300/J123v52n01_07

The paper describes Colorado’s Historic Newspaper Collection project in detail by summarizing a presentation on it, covering inception, features, access, collection development, user reactions, funding, and planning for the future. Notably, sections are devoted to a demonstration of the interface used for the project and to issues of access. A closing section records the question and answer portion of the presentation.


Bingham, A. (2010). The digitization of newspaper archives: Opportunities and challenges for historians. Twentieth Century British History, 21(2), 225–231. doi:10.1093/tcbh/hwq007

Bingham (2010) points out that the two main problems of conducting research using newspapers are accessibility and sheer volume, and that digitization has eased both. Digitization allows for keyword searching, which enables a more rigorous examination of the material and makes a richer content analysis possible. Bingham (2010) hypothesizes that reliance on digitized materials may encourage researchers to choose a source for its digital availability rather than its value. He also argues that keyword searching has its own pitfalls because it can miss synonyms and other uses of a word and can overlook the larger context of an article and its placement within the newspaper. Many times, pictures, advertisements, and other visual materials do not accompany the digital version of an article. Digitization provides no insight into the production of a newspaper or its reception by readers and must be accompanied by further research. Bingham (2010) encourages researchers to collaborate with those digitizing newspapers, as well as with commercial vendors of newspapers, to keep pricing affordable to the scholarly community as a whole.


Bossen, H., Davenport, L. D., & Randle, Q. (2006). Digital camera use affects photo procedures/archiving. Newspaper Research Journal, 27(1), 18-32. Retrieved from http://search.proquest.com.proxy.lib.wayne.edu/docview/200710930?accountid=14925

Bossen, Davenport, and Randle (2006) examined the transition to digital cameras and photography and its effect on the archiving and storage of photographs for publication in newspapers. Using a survey, they sought to find out how often and how digital cameras are used, whether what gets saved or archived differs from pictures taken with a film camera, and how variables such as circulation, staff size, and archiving policies affect the proportion of digital images archived. Bossen, Davenport, and Randle (2006) found that photographers took more pictures with digital cameras than with 35mm cameras, but only 72 percent would be archived in comparison to 85 percent for those shot on film. When looking solely at published photos, 48 percent of digital photos ended up being archived in comparison to only 24 percent of photos taken with film. In a digital world, proof sheets are never printed, and any image that does not make an editor’s cut is deleted. Bossen, Davenport, and Randle (2006) called for further research into the longevity and stability of digital storage and retrieval systems and their possible obsolescence.


Burrows, S. (2007). Online access to newspaper content in Canada: Issues and concerns. Serials Librarian, 53(1-2), 151-161. doi:10.1300/J123v53n01_12

Burrows discusses the digitization initiatives of Library and Archives Canada, detailing national and provincial projects revolving around newspapers. She focuses on the development of the projects, their purpose, and their scope. Ontario’s HALINET project is highlighted; the article discusses its index, staff and volunteer workloads, and its use of LizardTech’s Multi Resolution Seamless Image Database. Burrows then voices issues and concerns associated with digital archives.


Caudle, D. M., Schmitz, C. M., & Weisbrod, E. J. (2013). Microform -- not extinct yet: Results of a long-term microform use study in the digital age. Library Collections, Acquisitions, & Technical Services, 37(1-2), 2-12. doi:10.1016/j.lcats.2013.02.001

Caudle, Schmitz, and Weisbrod (2013) conducted a microfilm use study at Auburn University libraries and found that microfilm and electronic versions complement each other. Although the authors’ main focus was microfilm, they concentrated on newspapers, the most-used genre of microfilm, and compared and contrasted a microfilmed newspaper with its digital version. In addition, Caudle, Schmitz, and Weisbrod (2013) examined access and preservation of newspapers in both the microfilm and digital mediums.


Daniels, C., Holtze, T. L., Howard, R. I., & Kuehn, R. (2014). Community as resource: Crowdsourcing transcription of an historic newspaper. Journal of Electronic Resources Librarianship, 26(1), 36-48. doi:10.1080/1941126X.2014.877332

Daniels, Holtze, Howard, and Kuehn detail the effort by Archives and Special Collections (ASC) of the University of Louisville Libraries (ULL) to preserve and make accessible an historic African American newspaper. The article focuses on ASC’s crowdsourcing project to transcribe the digitized newspaper, including the implementation of the crowdsourcing, the software used (OCR, CONTENTdm, and Scripto), and the building of the infrastructure and design of the transcription program. The article then details the workflow of the project, its marketing, and the successes and challenges of the endeavor.


Davenport, L., Randle, Q., & Bossen, H. (2007). Now you see it; now you don't. The problems with newspaper digital photo archives. Visual Communication Quarterly, 14(4), 218-230. doi:10.1080/15551390701730216

Davenport, Randle, & Bossen (2007) traced the evolution of newspaper archives and how the transition to digital has changed archiving practices. As newspapers serve as the chroniclers of history, these changes are indirectly affecting the historical record for future generations. To document and measure these changes, they explored practices and policies of archiving and accessing photos by surveying members of the National Press Photographers Association (NPPA). Their findings revealed that there are few standards or best practices for preserving photos in this age of born-digital images and newspapers. Finally, Davenport, Randle, & Bossen (2007) apply the ideas of mass communication theorist Marshall McLuhan, who theorized that the “medium is the message”, to the digital archives of newspapers.


Deacon, D. (2007). Yesterday's papers and today's technology: Digital newspaper archives and 'push button' content analysis. European Journal of Communication, 22(1), 5–25. doi:10.1177/0267323107073743

Deacon (2007) postulated that the main issues in archiving newspapers were storage, information retrieval, and access. He focuses on how newspapers and researchers are affected by digitally based “push button” content analysis by studying the Lexis-Nexis newspaper archive. Keyword searching is one of the changes in search methods that digital newspaper archives have brought about. Deacon (2007) laments the loss of the visual dimension of the news, as digital versions of newspapers often lack the images, advertisements, and layout found in a print newspaper, and suggests that this leads to an emphasis on linguistics. Searching has become more reliant on the actual text, made possible by keyword searching, instead of looking at the larger context. Digital newspaper archives have limits in terms of coverage, as they more often hold recent news or news from the last two decades. Deacon (2007) discusses the concept of research reliability, in which computerized searches produce consistent, reliable, and replicable results over time.


Evans, M. R. (2007). The digitization of African American publications. Serials Librarian, 53(1-2), 203-210. doi:10.1300/J123v53n01_16

Broadly, this paper covers African American publications and the repositories that hold them, efforts to increase access to these publications, and issues encountered in pursuing digitization projects. Notably, it lists and discusses several electronic resources for accessing these types of publications before explaining how institutions are increasingly realizing the benefits of making content on this subject more easily available through digital repositories. It closes by explaining difficulties in digitizing, both in general and specific to African American materials.


Fleming, P., & Spence, P. (2008). The British Library newspaper collection: Long term storage, preservation, and access. Liber Quarterly, 18(3), 377-393. Retrieved from http://liber.library.uu.nl/index/lq/article/view/7937/8205

The article describes the strategy to improve the collection and preservation plan of the British Library Newspaper Collection. It explains the rationale for the project, describes the collection, and sets out the long-term “vision” for storage and preservation. The article is particularly interesting in that it describes in detail the state of the collection, much of which is unusable in its physical format because of deterioration. The storage and preservation plan is bulleted and succinct, and therefore easy to follow. Additionally, the article describes in detail plans for the new facility, moving the collection, the access system, and creating a strategy for access sustainability.


Geiger, B., Snyder, H., & Zarndt, F. (2011). Preserving and accessing born digital newspapers: A perspective from California. In H. Walravens (Ed.), IFLA publications series, annual membership subscription: Newspapers: Legal deposit and research in the digital era (pp. 31-35). Walter de Gruyter. Retrieved from http://site.ebrary.com.proxy.lib.wayne.edu/lib/wayne/reader.action?ppg=45&docID=10486435&tm=1415063300633

Geiger, Snyder & Zarndt (2011) explain the state of newspaper production in California, where most newspapers are now produced entirely digitally but efforts to save these papers are haphazard, if pursued at all. The article then describes the projected plan for creating a viable method of archiving born-digital newspapers and for enabling publishers, especially smaller ones, to perform work to that end themselves. The plan includes conducting a study of best practices and developing software, both of which the article describes.


Gustafson, K. L. (2014). Translation, technology, and the digital archive: Preserving a historic Japanese-language newspaper. American Journalism, 31(1), 4-25. doi:10.1080/08821127.2014.875349

Gustafson (2014) discusses her experiences with the Nikkei Newspaper Digital Archive Project (NNDAP), which is a joint project of the Hokubei Hochi Foundation and University of Washington Libraries to create a digital archive of a Seattle-based, Japanese-language newspaper. Through this case study, she examines the design decisions that shape digital newspaper archives, the role of preservation methods and their long-term impact on future historians’ use of newspaper archives, and the selection of material for inclusion. Gustafson (2014) points to digitization as being partially responsible for the loss of historical context found in newspapers, as digital versions of newspapers focus on displaying the article alone and not what surrounds it. Keyword searching compounds this problem and changes how researchers conduct research. Gustafson (2014) examines the role of content selectors, including librarians, funders, and commercial companies, and how the content in newspaper digital archives is shaped and manipulated by a complex process in which some content is excluded and other content is elevated. This selection process shapes the narrative that future historians will write.


Hasenay, D., & Krtalic, M. (2010). Preservation of newspapers: Theoretical approaches and practical achievements. Journal of Librarianship and Information Science, 42(4), 245-255. doi:10.1177/0961000610380818

In this article Hasenay and Krtalic examine the importance of newspaper preservation and break the process down into two concepts: preserving the original newspapers and preserving the information. For preserving the information, the two methods discussed are microfilming and digitization. They explain, step by step, how to set up a newspaper digitization project, covering the following topics: project organization, selection and acquisition of material, technical performance, and the importance of cooperation, evaluation, and quality control. As an example, they use the digitization efforts in Croatia and highlight how those projects could improve.


Holley, R. (2009). How good can it get?: Analysing and improving OCR accuracy in large scale historic newspaper digitisation programs. D-Lib Magazine: The Magazine of Digital Library Research, 15(3/4). doi:10.1045/march2009-holley

Holley's article examines the use of OCR technology within the National Library of Australia's Newspaper Digitisation Program (ANDP). The paper details how OCR technology has advanced over the years and how it would be used in a newspaper preservation effort. Topics include the factors that can affect OCR accuracy in historic newspapers, how to measure accuracy rates when using OCR (and what a good level of OCR accuracy would be), and how to improve OCR accuracy. Holley then details the ANDP's efforts to create a search system for historical newspapers using OCR and the methods in this article, and the success rates achieved.


James-Gilboe, L. (2005). The challenge of digitization: Libraries are finding that newspaper projects are not for the faint of heart. Serials Librarian, 49(1), 155-163. doi:10.1300/J123v49n01_06

James-Gilboe attributes the unique digitization challenges that historical newspapers present to “large image size, complex formats that change from page to page, and stories that are continued on different pages.” Other challenges include hidden headlines in tightly packed text blocks and small margins. Despite these obstacles she explains how digitization is valuable and gives a “Digitization Roadmap” on how to personally plan a project. She delves into the pros and cons of microfilm vs. digitization while using the “article-focused approach”, and discusses quality, file formats, and linking articles. ProQuest is used as an example for accuracy standards, topical selection, and their collaboration with different libraries.


Jefferson, R., Taylor, L., & Santamaria-Wheeler, L. (2012). Digital dreams: The potential in a pile of old Jewish newspapers. Journal of Electronic Resources Librarianship, 24(3), 177-188. doi:10.1080/1941126X.2012.706109

Jefferson, Taylor, and Santamaria-Wheeler detail the program at the University of Florida (UF)'s Isser and Rae Price Library of Judaica to digitize a special collection of international Jewish newspaper anniversary editions. After explaining the scope of the collection, the authors explain the software and techniques to be used for digitization, as well as how to select which items to convert to a digital format. Selection decisions were based on acquiring a variety of content (languages, places of publication), the newspaper's history and reputation, and copyright status. Technical decisions included using the Metadata Encoding and Transmission Standards (METS) format, using SobekCM open-source software (in conjunction with the UF Digital Collections system), and adding additional metadata after OCR scanning.


Kanungo, T., & Allen, R.B. (2007). Full-text access to historical newspapers. Star, 45(1). Retrieved from http://search.proquest.com.proxy.lib.wayne.edu/docview/23940188?accountid=14925

Kanungo and Allen's paper guides the reader through the National Endowment for the Humanities' system to take archived 19th century newspapers and transform them to digital copies available to all. The authors explain why traditional OCR software packages will fail when attempting to translate newspapers, and then how their package analyzes and breaks down a newspaper broadsheet using zone segmentation into information that can be converted to digital. Once digitized, the project members catalog the information, using multiple layers of metadata that recognize both content as well as the original format and layout of the newspaper. Finally, an interface needs to be created to access the material, both for those inspecting and updating the OCR as well as for researchers and students accessing the content of the collection.


King, E. (2005). Digitisation of newspapers at the British Library. Serials Librarian, 49(1), 165-181. doi:10.1300/J123v49n01_07

Since the 1990s, the British Library has engaged in digitizing its archived newspaper holdings. Problems with using OCR software with newspaper broadsheets are explained, as well as issues with indexing the material in a manner that is usable for researchers. Three case studies from within the British Library are detailed, showing the progression of technology and approaches in newspaper digitization that result in increasing success. Multiple XML layers, OCR abilities, and copyright status are all highlighted as steps along the way. International case studies on similar newspaper digitization projects are also spotlighted.


Klijn, E. (2008). The current state-of-art in newspaper digitization: A market perspective. D-Lib Magazine, 14(1), 5. doi:10.1045/january2008-klijn

In preparation for digitizing over 8 million pages from Dutch newspapers spanning several centuries, the Databank of Digital Daily Newspapers project at the Koninklijke Bibliotheek (the National Library of the Netherlands) surveyed other efforts under way around the world. Klijn details the information found, with sections on digital imaging technology, OCR, zoning and segmentation, metadata extraction, searchability, and web delivery systems. The article serves as a snapshot of newspaper digitization technologies in 2008 as well as a basic primer for others considering a similar project.


MacLennan, B., McMurdo, T., & Kilb, M. (2013). Vermont digital newspaper project: From reel to real. Serials Librarian, 64(1-4), 151-157. doi:10.1080/0361526X.2013.760147

The Vermont Digital Newspaper Project was initially formed in 1997 by the University of Vermont Libraries and the Vermont Department of Libraries to participate in the national United States Newspaper Program. During their four years of participation they inventoried and cataloged approximately 1,000 newspapers, microfilmed 170,000 pages of newspapers, and created an online database. Building on this project, another Vermont coalition was formed in 2009, this time to join the National Digital Newspaper Program (NDNP). Funded by the National Endowment for the Humanities, their goal was to “select, digitize, and make freely available up to 100,000 pages of historic Vermont newspapers.” MacLennan, McMurdo, and Kilb relate the guidelines of the NDNP project and detail how the Vermont Project adhered to them. They explain in detail the setup of the organizational infrastructure, the breakdown of the different steps in the project, and observations that were made along the way.


MacQueen, D. S. (2004). Developing methods for very-large-scale searches in Proquest historical newspapers collection and Infotrac The Times digital archive: The case of two million versus two millions. Journal of English Linguistics, 32(2), 124-143. doi:10.1177/0075424204265944

MacQueen's article highlights the size of newspaper digital archives, focusing on ProQuest's historical newspaper archives and Infotrac's The Times, and works through the difficulties of finding information in sources so large. MacQueen shows how to use the tools built into newspaper archives to turn the seemingly impossible into an almost manageable task.


Matusiak, K. K., & Munkhmandakh, M. (2009). A newspaper/periodical digitization project in Mongolia: Creating a digital archive of rare Mongolian publications. Serials Librarian, 57(1-2), 118-127. doi:10.1080/03615260802669136

This article explains the two-year process of digitizing and archiving rare historical newspapers and periodicals of Mongolia. The discussion covers the background, project goals, selection of materials, digital imaging, building the collection, preservation, and documentation. Notable sub-sections include a discussion of digitizing from paper versus digitizing from microfilm and a section on the difficulties that creating derivative files may cause when building an online collection.


McMurdo, T., & MacLennan, B. (2013). The Vermont digital newspaper project and the national digital newspaper program: Cooperative efforts in long-term digital newspaper access and preservation. Library Resources & Technical Services, 57(3), 148-163. Retrieved from http://search.proquest.com.proxy.lib.wayne.edu/docview/1443490664?accountid=14925

This article presents an overview of the Vermont Digital Newspaper Project (VTDNP) and the role it plays in its collaborative partnership with the National Digital Newspaper Program (NDNP). While a fair amount of space is devoted to explaining the NDNP, much of the focus is on detailing the methodology and experiences of the VTDNP. In particular, the largest section, entitled “Execution”, goes into specific detail on topics such as output components, quality assessment, and the steps in the digitization process. Lastly, it discusses outcomes of Phase I of the project and plans for Phase II.


Mieczkowska, S., & Pryor, K. (2002). Digitised newspapers at Norfolk and Norwich millennium library. Collection Building, 21(4), 155-160. doi:10.1108/01604950210447395

Mieczkowska and Pryor report on one of the first newspaper digitization projects in the United Kingdom, performed by the Norfolk Library and Information Service. First, they discuss the limitations of microfilm and concerns about the longevity of digital material, and then they move on to the Norfolk project itself, which started in 1999 and ran until 2001. The report covers the settings and equipment used, an explanation of the created index, and the staff involved. The advantages of the project are that the index is available countywide and copies can be ordered and disseminated. They conclude that the disadvantages of the project are far outweighed by the successes, especially those of starting with original negative microfilm and ending up with a digitized and indexed product.


Murray, R. L. (2005, June). Toward a metadata standard for digitized historical newspapers. In Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries (pp. 330-331). ACM.

In this case study Murray details the metadata development within the National Digital Newspaper Program, an initiative that expands access to historical newspapers. Murray explains the need for structural metadata when digitizing newspapers and how there can be complex links between each digitized piece of information. The solution adopted involves the Metadata Encoding and Transmission Standard (METS) and Metadata Object Description Schema (MODS). Murray then discusses the basic setup of the information, which involves a title document, an issue document, a page object, and a reel document. The end result is an easier exchange of information between newspaper digitization projects.


Mussell, J. (2012). The passing of print. Media History, 18(1), 77-92. doi:10.1080/13688804.2011.637666

Mussell (2012) examines how newspapers are examples of ephemera by tracing the rise of newspapers. He considers how the passage of print into a born-digital world is changing ephemera and cultural memory as a whole. Mussell (2012) examines the paradox of the digital: some digital items persist, while for others there may be no digital record at all. He asserts that digital objects can survive as long as there is hardware and software available to process them. Mussell (2012) places that responsibility on existing institutions of memory as curators and cites the development of institutional digital repositories as institutions’ acceptance of this role. However, these existing institutions of memory often lack the resources to ensure that long-term digital preservation happens. Mussell (2012) points to the scholarly community as a partner in this endeavor to help solve the problem of saving the news of today for the future. Digital content is everywhere, but only a fragment of what is produced will survive.


Reakes, P., & Ochoa, M. (2009). Non-commercial digital newspaper libraries: Considering usability. Internet Reference Services Quarterly, 14(3-4), 92-113. doi:10.1080/10875300903336357

Reakes and Ochoa survey the challenges in creating digital newspaper libraries through case studies of usability testing done on the University of Florida's Florida Digital Newspaper Library (FDNL) and the Chronicling America/National Digital Newspaper Program (NDNP). Issues examined include whether search pages are intuitive, ease of navigation, whether result pages provide the information needed, and a lack of proper metadata in some archives. Direct comparisons between FDNL and Chronicling America are presented throughout to highlight the differences in approach between the projects.


Reilly, B. (2007). The library and the newsstand: Thoughts on the economics of news preservation. Journal of Library Administration, 46(2), 79-85. doi:10.1300/J111v46n02_06

Reilly (2007) places the burden of responsibility for preserving newspapers in the hands of libraries. Born-digital newspapers have changed the relationship between libraries and the news media. Previously, libraries ensured print newspapers would be around for future generations of readers by keeping old issues tucked away on shelves or on microfilm, which indirectly relieved newspaper publishers of the cost of maintaining and servicing their newspaper archives. The Internet is the newsstand of choice and convenience as paper subscriptions continue to decline, and libraries have not effectively mastered web archiving to ensure that an online news story here today will be available years from now. According to Reilly (2007), the consolidation of news media has indirectly created more homogeneous content, as there is a narrowing of the spectrum of news and opinion. If commercial news vendors and aggregators are expected to maintain the news for scholars, there will be gaps in the historical record, because preservation is not profitable when re-use is limited and often years in the future. Reilly (2007) recommends that news organizations grant limited, specific digital uses of their intellectual property to libraries for the purpose of archiving the content in electronic form. Libraries cannot shoulder this financial burden alone, and Reilly (2007) suggests enlisting the government, private sectors, and foundations as investors. Libraries also must learn to share even more with each other and change the collection development model to embody a community network of libraries sharing resources and preservation responsibilities to enable the necessary preservation to take place.


Reilly, B. F., & Simon, J. (2010). Shared digital access and preservation strategies for serials at the Center for Research Libraries. Serials Librarian, 59(3-4), 271-280. doi:10.1080/03615261003619060

The World Newspaper Archive (WNA), launched in 2008, was created by the Center for Research Libraries (CRL), its partner institutions, and Readex. The WNA is a collaboration of libraries and electronic publishers worldwide that had an interest in collectively digitizing their holdings. The three major goals of this program are community access, longevity and “continued functionality of the news content for its community”, and growth in its operations. Reilly and Simon describe the first effort of the WNA as focusing on Latin American newspapers, specifically the content of ICON, the International Coalition on Newspapers, in which Latin America is heavily represented. From there the standards of content selection are reviewed, the details of the project's structure and execution are explained, and future phases of the operation are discussed. Examples of their work are also shown as screenshots in the article.


Seib, R. (2002). Exilpresse digital: The deutsche bibliothek's digitization of selected German exile periodicals and newspapers from the 1933-1945 period. Serials Librarian, 43(2), 29-39. doi:10.1300/J123v43n02_04

Seib follows Deutsche Bibliothek's Exilpresse digital project, converting newspapers and serials from 1933-1945 into a digital format, beginning with the selection of materials for conversion (based on physical condition, availability, copyright, and completeness). The article then shifts to the technical side of the project, discussing the lack of strong OCR software at the start of the effort, and the decision to manually transcribe all metadata in an SGML structure with a DTD modeled after Dublin Core to record individual units of data. Finally, the article examines the search abilities of the database using the metadata captured by the project.


Shaw, J. (2005). 10 billion words: The British Library British newspapers 1800–1900 project: Some guidelines for large-scale newspaper digitisation. Retrieved from http://www.ifla.org/IV/ifla71/papers/154e-Shaw.pdf

Shaw gives an in-depth look at the British Library’s newspaper project. The scope of the project was to “digitise a large volume of historic newspapers with the highest possible quality.” The team determined that in order to accomplish this goal, they needed to plan the project thoroughly before starting, know the material they were working with, and have adequate resources. For this project the British Library established standards specifically for large-scale digitization, which include: “condition survey/assessment of source material to act as a benchmark, filming one page per frame to ensure a consistent look, only digitize from microfilm for speed, consistency and cost, and human intervention” to assist in quality control. The article also describes the technical processes involved in the project.


Silverman, R. (2014). What, no backups? Preserving hardcopy newspapers in the digital age. Retrieved from http://www.ifla.org/files/assets/newspapers/Geneva_2014/s6-silverman-en.pdf

Silverman discusses the history behind the preservation of newspapers and books and what prompted national funding to be created for library preservation in the 1980s. The virtues of microfilm are discussed along with the disadvantages of physical damage that can occur when microfilming. Silverman goes on to examine all of the issues associated with “digitizing” historical newspapers and makes a strong argument for keeping the originals after the digitizing process is over and for practicing preventative maintenance on original material.


Summerlin, D. (2014). Selecting newspaper titles for digitization at the Digital Library of Georgia. D-Lib Magazine, 20(9/10). doi:10.1045/september2014-summerlin

In this case study, Summerlin describes the many factors affecting the selection process for the Digital Library of Georgia’s newspaper digitization program. The author explains that the intention was to find a balanced approach that did not place any one criterion above the others. While this aim was jeopardized somewhat by inherently restrictive factors such as copyright, the staff felt that increasing the emphasis on some other criteria allowed them to reach a compromise. After briefly explaining each criterion, the discussion turns to usage of the site, which papers received the most attention, and the larger social and historical reasons for some surprising results in that regard.


Sweeney, M., & Hawkins, L. (2007). The national digital newspaper program: Building on a firm foundation. Serials Review, 33(3), 188-189. doi:10.1016/j.serrev.2007.05.005

In this article, Sweeney and Hawkins briefly introduce the National Digital Newspaper Program, its origins in the United States Newspaper Program, and the database site, Chronicling America, that will result from it. The program is a collaborative effort between the National Endowment for the Humanities (NEH) and the Library of Congress (LC), which in turn work alongside many organizations in the greater library community. The brief article covers how the program is structured and the phases of the project. In the initial phase, the NEH provided awards to six institutions that, along with the LC, digitized pages of newspapers published from 1900 to 1910. The LC was in charge of much of the technical logistics and workflow development. The article also discusses plans for the period after the initial phase was completed.


Tanner, S., Munoz, T., & Ros, P. H. (2009). Measuring mass text digitization quality and usefulness. D-Lib Magazine: The Magazine of Digital Library Research, 15(7/8). doi:10.1045/july2009-Munoz

Using a case study of the British Library's 19th Century Newspapers Database, the authors assess OCR accuracy by measuring not only individual character accuracy, but also word accuracy and significant word accuracy. The article gives a history of OCR and how it works, then explains why, when searching articles, it matters more that some words are correctly identified than others. The output from OCR scans is examined, detailing how the resulting information can be used to build a stronger methodology for converting newspapers to a digital format.
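
The three levels of accuracy named in this annotation can be illustrated with a minimal sketch. It is not the authors' method: the edit-distance formula for character accuracy, the bag-of-words matching for word accuracy, and the small `STOPWORDS` set used to decide which words count as "significant" are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical stopword list; a real study would define "significant" words itself.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "was", "on"}
PUNCT = ".,;:!?\"'()"


def edit_distance(a, b):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def tokenize(text):
    """Lowercase words with surrounding punctuation removed."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]


def char_accuracy(truth, ocr):
    """1 minus (character edit distance / length of the ground truth)."""
    if not truth:
        return 1.0 if not ocr else 0.0
    return max(0.0, 1 - edit_distance(truth, ocr) / len(truth))


def _matched_fraction(truth_words, ocr_words):
    """Share of ground-truth words that also occur in the OCR output."""
    if not truth_words:
        return 1.0
    ocr_counts = Counter(ocr_words)
    matched = sum(min(n, ocr_counts[w]) for w, n in Counter(truth_words).items())
    return matched / len(truth_words)


def word_accuracy(truth, ocr):
    return _matched_fraction(tokenize(truth), tokenize(ocr))


def significant_word_accuracy(truth, ocr):
    """Word accuracy restricted to words outside the stopword list."""
    def keep(words):
        return [w for w in words if w not in STOPWORDS]
    return _matched_fraction(keep(tokenize(truth)), keep(tokenize(ocr)))


truth = "The mayor of Boston opened the theatre"
ocr = "The rnayor of Boston opened the theatre"  # "m" misread as "rn"
print(char_accuracy(truth, ocr))
print(word_accuracy(truth, ocr))
print(significant_word_accuracy(truth, ocr))
```

In this made-up sample a single misread letter leaves character accuracy high, while significant word accuracy drops further because the damaged word is one a researcher would be likely to search for.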


Waldman, M. (2004). International newspapers and research. Serials Librarian, 45(4), 71-80. doi:10.1300/J123v45n04_06

Waldman (2004) examines the reasons why coverage of international newspapers is limited and the larger implications for researchers. Local coverage of any event is typically the best source of information, and it is imperative for researchers to have access to local newspapers in order to get different perspectives on an event. Waldman (2004) points out that very few international newspapers are intentionally preserved, either through microfilming or through digitization by commercial vendors. Online or born-digital newspapers offer more immediate access, but many lack archives or are behind a subscription paywall. Waldman (2004) asserts that many online newspapers are dynamic, which makes it difficult to record accurate source information for later researchers. International newspapers are vital sources of information for researchers, and more needs to be done to ensure the content is available in the future.


Widmer, L. J., Taylor, L. N., & Sullivan, M. V. (2012). Florida digital newspaper library: Library and publisher partnerships for access and preservation. Florida Libraries, 55(2), 15-17. Retrieved from http://www.flalib.org/fl_lib_journal/Fall2012.pdf

This article describes the status of the Florida Digital Newspaper Library, with emphasis on its efforts to create a more sustainable workflow. The workflow discussion explains that, by partnering with publishers, the University of Florida eventually moved to receiving papers as born-digital files rather than in print, although some papers were still received in print in 2012. Doing this, the authors argue, makes the process more efficient by eliminating the need to digitize the print papers. This has enabled the library to meet demand while reducing the staff and working time devoted to the process. The article ends with a status report on the collections currently included in the digital library.


Xuyan, C. (2012). Saving our past into the future: The preservation and digitisation of old newspapers at Shanghai library. International Preservation News, (56), 21-28. Retrieved from http://search.proquest.com.proxy.lib.wayne.edu/docview/1033523474?accountid=14925

Xuyan takes an exciting look into the digitization of old newspapers at the Shanghai Library in China. The Shanghai Library is the second largest library in China and holds the finest collection of historical newspapers in the country. In addition to many valuable Chinese newspapers, it holds 92 foreign-language newspaper titles from 1851 to 1949, many of them in European languages. Xuyan examines the excellent storage facilities and conservation treatments performed at the Shanghai Library, and then delves into its preservation microfilming efforts and, more recently, its digital scanning of the Chinese newspapers. Equipment and settings are discussed, as well as how the library generated metadata and transformed it from MARC to the Dublin Core schema.
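
A MARC to Dublin Core transformation of the kind mentioned above can be sketched as a simple crosswalk. The sketch below is not the Shanghai Library's mapping, which the annotation does not describe. The `CROSSWALK` table covers only a few commonly cited tag-to-element pairs, and the record structure (a list of tag and subfield pairs) and the sample values are hypothetical.

```python
# Assumed, partial crosswalk: (MARC tag, subfield code) -> Dublin Core element.
CROSSWALK = {
    ("245", "a"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("041", "a"): "language",
    ("650", "a"): "subject",
    ("651", "a"): "coverage",
}


def marc_to_dublin_core(fields):
    """Map a simplified MARC record to a dict of Dublin Core elements.

    `fields` is a list of (tag, {subfield_code: value}) pairs. Unmapped
    tags and subfields are skipped; repeated elements accumulate in lists.
    """
    record = {}
    for tag, subfields in fields:
        for code, value in subfields.items():
            element = CROSSWALK.get((tag, code))
            if element is None:
                continue
            # Drop trailing ISBD punctuation such as " /" or ","
            record.setdefault(element, []).append(value.strip(" /:;,"))
    return record


# Hypothetical record for illustration only.
sample = [
    ("245", {"a": "Example Daily News /"}),
    ("260", {"a": "Example City :", "b": "Example Press,", "c": "1900-1910"}),
    ("650", {"a": "Newspapers"}),
    ("999", {"a": "local note"}),
]
print(marc_to_dublin_core(sample))
```

Keeping the mapping in a table rather than in code makes it easy to review and extend as more MARC fields need to be carried over.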