Metadata for Digital Images in Public History Institutions
Metadata for Digital Images in Public History Institutions: Considerations and Techniques
Philip Croff
Definition of Project
Digital images are one of the most common material types held by public history institutions, which include, but are not limited to, archives and museums. The metadata attached to these images is crucial in allowing users to search and browse an institution’s collection and find what they are looking for with the least amount of time, effort, and assistance from staff. As metadata creation and maintenance is a major task for curators of digital images, exploring research in metadata creation and how users interact with metadata is worthwhile. Topics include search interfaces for tag-based retrieval, using GIS to manage digital image collections, creating metadata using automatic processes, how balancing metadata between its creators and its users benefits both, focusing on metadata elements that users find the most useful, using games to crowdsource descriptive metadata, creating accurate and complete metadata through crowdsourcing, the usefulness of free-text metadata, and the drawbacks of stripping metadata when digital images “go live” on the internet. The articles chosen were published after the year 2000 in recognized journals in the library and information science field. These articles and studies pertain to public history institutions, though articles regarding other institutions, such as those tied to universities, are also included if their considerations and techniques are usable in public history institution settings. The terms digital “images” and “photographs” are used interchangeably.
Annotations
Bar‐Ilan, J., Zhitomirsky‐Geffet, M., Miller, Y. & Shoham, S. (2012). Tag‐based retrieval of images through different interfaces: A user study. Online Information Review, 36(5), 739-757.
This article examines various interfaces used for retrieving images through their assigned tags and focuses on determining which interface is best for tag-based retrieval. In this study, participants used a textbox interface, a tag cloud interface, and an interface where users chose concepts from an ontology created from tags assigned to images. The researchers found that the tag cloud interface provided better recall than the textbox search, while the ontology interface provided the best recall when users chose to conduct an expanded ontology search rather than a simple one. Although these two interfaces provided better recall, the textbox required the least amount of time and effort. Users suggested that the ontology interface should be made simpler. As a result of this study’s findings, public history institutions may want to provide users with multiple interfaces for searching their digital image collections. For example, casual users may prefer a textbox search interface, as it is the least complex and is the interface design most familiar to them (based on their experience with interfaces like Google’s search engine). The researchers conclude that further work should be conducted on developing an ontology-based search that is more appealing to users.
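The difference between a simple and an expanded ontology search can be illustrated with a minimal Python sketch. It is not the interface the researchers tested: the image tags, the ontology, and the function names are all hypothetical, and the sketch assumes that an "expanded" search means also matching concepts narrower than the one the user selected.

```python
# Hypothetical data: image IDs mapped to the tags assigned to them.
IMAGE_TAGS = {
    "img1": {"dog", "park"},
    "img2": {"puppy"},
    "img3": {"cat"},
    "img4": {"animal", "zoo"},
}

# Hypothetical ontology: each concept maps to its narrower concepts.
NARROWER = {
    "animal": {"dog", "cat"},
    "dog": {"puppy"},
}


def simple_search(tag):
    """Return images carrying exactly the chosen tag (textbox-style match)."""
    return {img for img, tags in IMAGE_TAGS.items() if tag in tags}


def expand(concept):
    """Return the concept plus all narrower concepts, followed transitively."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(NARROWER.get(current, ()))
    return seen


def expanded_search(concept):
    """Return images tagged with the concept or any narrower concept."""
    terms = expand(concept)
    return {img for img, tags in IMAGE_TAGS.items() if tags & terms}


def recall(retrieved, relevant):
    """Share of the relevant images that the search actually retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0
```

In this toy collection, a user looking for dog images who searches on the exact tag "dog" misses the image tagged only "puppy", while the expanded search retrieves both. This is one plausible reason an expanded search would improve recall, at the cost of a more complex interface.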
Boyer, D., Cheetham, R., & Johnson, M. (2011). Using GIS to manage Philadelphia's archival photographs. American Archivist, 74(2), 652-663.
The City Archives’ collection, maintained by the Philadelphia Department of Records, is paired with Geographic Information Systems (GIS) technology, which gives users a map-based interface for browsing and searching for digital images. Users can also view sites through Google Earth and Street View to see how buildings and sites have changed over the years. Public support for and use of this portal are high. Public history institutions with digital photograph collections may want to consider adopting a similar interface, as it supports the management of the collection’s metadata (the primary reason for the creation of PhillyHistory.org), opens the collection to public use, and enables the institution to generate revenue from its holdings through licensing and the sale of prints and of items and gifts featuring prints.
Corrado, E. M., & Jaffe, R. (2014). Transforming and enhancing metadata for enduser discovery: A case study. Italian Journal of Library, Archives and Information Science, 5(2), 33-48.
This case study focuses on the Binghamton University Libraries, which adopted Rosetta in 2011 for preserving digitized and born-digital materials, including photographs. The project, ongoing at the time of writing, seeks to extract the metadata embedded in the more than 350,000 digital images held by the Libraries and the university’s photographer in order to create descriptive metadata that can help users search the collection. Due to the volume of images, completing descriptive metadata for each image individually was deemed impractical, so an automatic process was adopted for creating descriptive metadata for images in the collection. The process described in this study is invaluable to public history institutions faced with curating large amounts of born-digital images. It streamlines the initial creation of descriptive metadata, increasing the usability of the collection, and it means that staff do not have to devote time to analyzing and creating descriptive metadata for individual photos in a large collection.
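The general idea of deriving descriptive metadata from embedded metadata can be sketched in a few lines of Python. This is not the workflow used at Binghamton: the crosswalk, the embedded field names, and the function name are illustrative assumptions, and the sketch presumes the embedded fields have already been read out of the image file into a dictionary by some extraction tool.

```python
# Hypothetical crosswalk from embedded field names to Dublin Core elements.
# The field names are illustrative; a real collection would define its own.
CROSSWALK = {
    "Headline": "title",
    "Caption-Abstract": "description",
    "By-line": "creator",
    "Keywords": "subject",
    "DateTimeOriginal": "date",
    "CopyrightNotice": "rights",
}


def to_dublin_core(embedded):
    """Build a simple Dublin Core-style record from embedded metadata.

    `embedded` is a dict of field name -> value (a string or list of
    strings). Unmapped fields and empty values are skipped.
    """
    record = {}
    for field, value in embedded.items():
        element = CROSSWALK.get(field)
        if element is None:
            continue  # no descriptive equivalent, e.g. camera settings
        values = value if isinstance(value, list) else [value]
        cleaned = [str(v).strip() for v in values if str(v).strip()]
        if cleaned:
            record.setdefault(element, []).extend(cleaned)
    return record
```

Applied to every image in a batch, a mapping of this kind yields a first-pass descriptive record without staff describing each photograph by hand. The quality of the result depends entirely on what the photographer embedded in the file.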
Neal, D. (2008). News photographers, librarians, tags, and controlled vocabularies: Balancing the forces. Journal of Library Metadata, 8(3), 199-219.
This article explores how metadata creation can be balanced between a user base and the professionals who curate the materials in a way that helps both groups. Curators can focus their efforts, resulting in collections that are easier to search and use. In this case, photojournalists were surveyed to determine their preferences for metadata in a web-based archival system. The study found that authoritative controlled vocabularies were not practical for describing photojournalists’ digital photographs, as the vocabularies were not extensive enough and/or photojournalists did not trust them and resorted to browsing instead of searching. The researcher suggests further work on requiring photojournalists to include a minimum number of tags with their photographs before uploading them. The researcher also suggests further study of tag cloud search interfaces and of a “Did you mean…” feature for keywords that may have been misspelled. Photojournalists are one of the types of patrons a public history institution may seek to serve, perhaps as part of its outreach mission, as photojournalists are able to distribute the products of a public history institution (including photographs) to its local community and the rest of the outside world. Also, the digital photographs that photojournalists take may fall within the mission and collection development policy of a public history institution. Beyond metadata for photojournalists, the author describes in detail a framework for surveying a user base regarding its metadata preferences, which can be applied to different types of digital photograph collections and user bases.
Fear, K. (2010). User understanding of metadata in digital image collections: Or, what exactly do you mean by ‘coverage’? American Archivist, 73(1), 26-60.
This study’s researcher notes that little research has been conducted on how useful metadata is to the user communities of collections. Her study concludes that the terminology and content that Dublin Core provides are suitable for users who are not experts in the field they are studying. It also found that the interface that users interact with is important, as it may lead users to regard some elements as not useful, and too much text may prompt users to ignore it. Given these findings, public history institutions should take care in designing the interfaces for their digital image collections so that they are as useful as possible. Many users do not find some of the Dublin Core fields to be as useful as others, and users still tend to prefer viewing images to determine whether they meet their search criteria, with textual metadata being supplementary. Knowing this, public history institutions can develop and modify the search interfaces for their digital image collections, helping users locate the images they are looking for more efficiently and with less assistance from staff. As Dublin Core is often used by public history institutions, they can also put more emphasis on the fields that users tend to perceive as useful.
Flanagan, M., & Carini, P. (2012). How games can help us access and understand archival images. American Archivist, 75(2), 514-537.
This study’s researchers note that the volume of documents that archives receive has resulted in backlogs, limiting access to useful materials. They also note that item-level descriptions for digital images are very useful for users, but resources (including staff resources) are limited. Crowdsourcing projects for adding descriptive metadata to digital images have been attempted with the Library of Congress’ Flickr Pilot Project as well as the New York Public Library’s “What’s on the Menu?” project. This study involved the Metadata Games project, which aims to encourage and reward players of specially designed web-based (HTML5) games in which they contribute descriptions (including tags) to digital images. The game formats tested include one in which users freely add tags to describe images and another in which one user describes an image to a second user, who must pick the described image from an array of images in real time. The researchers conclude that games resulted in individuals contributing more entries than systems without games. In addition to collecting user-contributed tags, the authors suggest that Metadata Games could also be used to encourage nonusers to engage with archival materials.
Gupta, D. K. & Sharma, V. (2018). Analytical study of crowdsourced GLAM digital repositories. Library Hi Tech News, 35(1), 11-17. https://doi.org/10.1108/LHTN-07-2017-0055
This study concludes that crowdsourcing, or using the public and its knowledge to accomplish a goal, is useful in maintaining GLAM digital images, including in the curation of metadata. The underlying premise is that the knowledge of multiple people (the crowd) is potentially greater than the knowledge of a single person, even if that person is an expert. Crowdsourcing allows an institution to engage with remote users, curate metadata that is more complete and more accurate, and accomplish goals when faced with limited budgetary or staff resources. Wikipedia operates in a similar manner. For example, people who witnessed an event first-hand can contribute metadata to photographs, perhaps better than an archivist could. With metadata being important for the usability of digital image collections held by public history institutions, crowdsourcing projects can improve the usability of their collections while using fewer resources.
Han, M., Jackson, A. S., Palmer, C. L., & Zavalina, O. K. (2009). Evaluating descriptive richness in collection-level metadata. Journal of Library Metadata, 8(4), 263-292. https://doi.org/10.1080/19386380802627109
Although digital image metadata is often thought of as having to meet established standards and serve as a complete record, this study analyzed the metadata used by 202 public history institutions and found that the free-text Description field often provides users with more information about subjects and object types than the metadata fields intended for those entries. Given these findings, collection managers must recognize that free-text metadata entries can be accurate and rich. They must consider the usefulness of the Description field when curating metadata for the digital photographs in their collections and understand how it affects the usability of those collections.
Liew, C. L., & Lim, S. (2011). Metadata quality and interoperability of GLAM digital images. Aslib Proceedings, 63(5), 484-498. https://doi.org/10.1108/00012531111164978
Through digitization, public history institutions, including galleries, libraries, archives and museums (GLAM), have been able to share digitized and born-digital photograph collections through the internet. Metadata allows users to search for and use these resources, but this study found that description and representation practices vary considerably among public history institutions. The study attributes this to the fact that these four types of institutions have different goals and purposes for their collections, hold collections of different natures, and place different levels of significance on metadata types. Another issue is that institutions may prioritize institutional missions and goals over conforming to standards. A contributing factor to the lack of metadata standards among these institutions is that digital images are treated as surrogates for printed photographs. These are some of the challenges faced by those in charge of digital image collections at public history institutions, as metadata is crucial for the usability of an institution’s digital collection. These findings also show that institutions must weigh meeting standards (whether international, national, or shared within a consortium) against meeting the wants and needs of local patrons. Archivists and librarians must gauge the usage patterns of their collections and adjust their metadata accordingly.
Saleh, E. I. (2018). Image embedded metadata in cultural heritage digital collections on the web: An analytical study. Library Hi Tech, 36(2), 339-357.
This study found that images in cultural heritage digital collections were often stripped of their metadata. This finding is unfortunate for users, as their ability to discover and search digital image collections depends highly on the accuracy and richness of metadata. Among the collections that this study focused on, which included the commons for the Library of Congress, the New York Public Library, and the national libraries of several other countries, only 3% of images had keyword and subject metadata. This study concludes that public history institutions should overcome issues regarding metadata creation and curation for their digital image collections. Although institutions must weigh product versus process, the lack of metadata makes discovery and searching much more difficult for users.