Managing Audio Metadata
Can You Hear Me Now? Managing Audio Metadata for Future Access
Annotations by David Serra
Definition of Project
- While many guides detail archival processing workflows for digitizing analog audio formats and preserving the resulting digital files, this annotated bibliography attempts to capture the research on managing the metadata of those digital audio files. The proper administration of metadata associated with digital audio files helps ensure this content remains accessible in the future. The selection of publications is limited to the last fifteen years, since the challenges of digital audio preservation have mainly been addressed in that timeframe. The literature selected covers issues regarding metadata standards and the challenges of metadata extraction. The bibliography includes peer-reviewed journal articles and conference proceedings. Search terms that have yielded the best results include: “music metadata,” “audio metadata extraction,” “digital preservation of sound recordings,” “audio metadata interoperability,” “digital audio accessibility,” and “digital audio curation.”
Annotations
Clair, K. (2008). Developing an audiovisual metadata application profile. Library Collections, Acquisitions, & Technical Services, 32(1), 53-57. https://doi.org/10.1080/14649055.2008.10766193
- Clair discusses the metadata standards currently used for audiovisual materials, such as Dublin Core (DC), the Metadata Encoding and Transmission Standard (METS), PBCore, and the Moving Picture Experts Group (MPEG) standards, while presenting a case study of a digital project implementing metadata-management guidelines at Penn State University Libraries that could be considered for use at other institutions. Due to its “ability to transmit only certain segments of a program” (p. 54), PBCore was applied to the Rabin Collection and was determined to work at a local metadata level. Because this case study was the first attempt to apply audiovisual metadata to the Rabin Collection, more insight is needed to confirm the standard will work across other Penn State University collections. Transparency of metadata standards between institutions could help create consistency and ease of access for information seekers, which the local results of this case study highlight.
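Clair's point about segment-level description can be illustrated with a minimal sketch of a PBCore-style record, built here with Python's standard xml.etree.ElementTree. The element names (pbcoreDescriptionDocument, pbcoreTitle, pbcorePart) follow the published PBCore schema as I understand it, but the record itself is hypothetical and omits the namespace declaration and instantiation details a real record would carry.

```python
# Minimal PBCore-style sketch: one program described at the segment level.
# The collection item and identifiers are hypothetical.
import xml.etree.ElementTree as ET

doc = ET.Element("pbcoreDescriptionDocument")
ET.SubElement(doc, "pbcoreIdentifier", source="local").text = "rabin-demo-001"
ET.SubElement(doc, "pbcoreTitle", titleType="Program").text = "Oral history interview (demo)"
ET.SubElement(doc, "pbcoreDescription").text = "Single program described at the segment level."

# pbcorePart lets a single asset expose individual segments of the program.
for n, label in enumerate(["Introduction", "Interview", "Closing remarks"], start=1):
    part = ET.SubElement(doc, "pbcorePart")
    ET.SubElement(part, "pbcoreTitle", titleType="Segment").text = f"Segment {n}: {label}"

print(ET.tostring(doc, encoding="unicode"))
```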
Correya, A., Hennequin, R., & Arcos, M. (2018). Large-scale cover song detection in digital music libraries using metadata, lyrics and audio features. CoRR. https://arxiv.org/abs/1808.10351
- Until recently, song identification within Music Information Retrieval (MIR) was performed mainly by analyzing the audio itself; musical features such as key and tempo offered a useful match against metadata in datasets. Correya et al. use metadata datasets containing lyrics, along with audio analysis, to identify cover versions of songs. While the study focuses on identifying cover songs, the methods described could possibly be adapted to identify recordings that are not strictly covers but variations of other works, supporting accurate identification and metadata organization. Both of these techniques could aid musicologists and others working with large music datasets.
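The fusion idea, combining text similarity from metadata and lyrics with an audio-feature score, can be sketched in a few lines of Python. This is not the authors' pipeline; the Jaccard text measure, the weighting scheme, and the toy audio-similarity values are assumptions made purely for illustration.

```python
# Toy fusion of metadata/lyrics similarity with an audio-feature score
# to rank cover-song candidates. All values here are invented.
def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two text fields."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def cover_score(query, candidate, w_text=0.6, w_audio=0.4):
    """Weighted fusion of text similarity and a precomputed audio similarity."""
    text_sim = jaccard(query["title"] + " " + query["lyrics"],
                       candidate["title"] + " " + candidate["lyrics"])
    return w_text * text_sim + w_audio * candidate["audio_similarity"]

query = {"title": "yesterday", "lyrics": "all my troubles seemed so far away"}
candidates = [
    {"title": "yesterday (live)", "lyrics": "all my troubles seemed so far away",
     "audio_similarity": 0.82},
    {"title": "tomorrow never knows", "lyrics": "turn off your mind relax",
     "audio_similarity": 0.10},
]
for c in sorted(candidates, key=lambda c: cover_score(query, c), reverse=True):
    print(f"{c['title']}: {cover_score(query, c):.2f}")
```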
Corthaut, N., Govaerts, S., Verbert, K., & Duval, E. (2008). Connecting the dots: Music metadata generation, schemas and applications. In J. P. Bello, E. Chew, & D. Turnbull (Eds.), Proceedings of the Ninth International Conference on Music Information Retrieval (ISMIR), September 14-18, 2008, Philadelphia, PA (pp. 249-254). https://www.researchgate.net/publication/200688582_Connecting_the_Dots_Music_Metadata_Generation_Schemas_and_Applications
- This conference paper addresses the music metadata schema used within a computer science department’s music metadata application tools, which were created prior to the schema’s implementation. These tools could be applied to any digital music collection. Domains (the software or source that generates the metadata), metadata standards, and metadata field clusters (elements) were compared in tables, using a set of equations, to see how they relate or prove useful to one another. The authors provide some insight into decision making around metadata choices for applications at a time when music metadata was becoming an increasingly popular topic, and the outcomes for function, parameters, and music metadata format are weighed against those tables.
Herring, M. (2015). An application profile of MODS to describe complex digital musical audio resources. Journal of Library Metadata, 15(2), 63-78. https://doi.org/10.1080/19386389.2015.1041853
- Herring describes the application of the Metadata Object Description Schema (MODS) to the Music Preserved archive and the John R. T. Davies jazz collection at the University of York. The case study outlines the MODS elements used to describe the metadata of works within works, that is, pieces containing many musical and nonmusical parts, such as live recordings with breaks or applause. Preserving these recordings is important to capture the historical significance of the event elements within the music performance as a whole. The case study is unique in its use of the MODS element <relatedItem>, which provides the bilevel structure needed to nest records within records. Herring (2015) describes MODS as “simpler than traditional MARC but richer than Dublin Core. Although it is a general standard, it provides richer descriptive elements than PBCore or the EBU standard” (p. 69). The decision to use this particular MODS element to capture the appropriate metadata of the digital files is central to the case study, with the files, along with the archive’s other digital resources, working to paint a bigger contextual picture.
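The record-within-a-record pattern Herring describes can be sketched as a minimal MODS-style document, built here with Python's xml.etree.ElementTree. The nesting via <relatedItem type="constituent"> reflects the element named in the article, but the titles are hypothetical and a real record would declare the MODS namespace and carry far more descriptive detail.

```python
# Minimal MODS-style sketch: a live recording whose constituent works are
# nested inside <relatedItem> elements. Titles are hypothetical.
import xml.etree.ElementTree as ET

def add_title(parent, text):
    """Attach a MODS titleInfo/title pair to any element."""
    title_info = ET.SubElement(parent, "titleInfo")
    ET.SubElement(title_info, "title").text = text

mods = ET.Element("mods")
add_title(mods, "Live concert recording (demo)")

# Each constituent work on the recording becomes a record nested within the record.
for piece in ["First set: ballad", "Spoken introduction and applause", "Second set: blues"]:
    related = ET.SubElement(mods, "relatedItem", type="constituent")
    add_title(related, piece)

print(ET.tostring(mods, encoding="unicode"))
```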
Melvin, D. O. (2014). Managing metadata interoperability within audio preservation framework: Integrating the metadata encoding & transmission standard (METS) and multichannel source material into digital library audio collections. Library Philosophy and Practice, 1117, 1-29. http://digitalcommons.unl.edu/libphilprac/1117
- As the amount of digital content grows exponentially, so do users’ demands for information about, and metadata describing, those digital files. Melvin provides a literature review of the ingestion, organization, and preservation of metadata within a digital library audio collection through the Broadcast Wave Format (BWF) and the Metadata Encoding and Transmission Standard (METS). Both formats contain metadata elements, with BWF somewhat more limited in scope and METS carrying administrative, descriptive, and technical metadata. While BWF is introduced in the article, the focus is on METS and how its elements work with descriptive standards such as Dublin Core and MARC.
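The wrapping relationship Melvin discusses, METS as a container that carries an external descriptive record and points at the audio file itself, can be sketched with a skeletal METS document built in Python. The identifiers, file name, and Dublin Core values are hypothetical, and the namespace declarations (and the xlink:href attribute a real FLocat uses) are omitted for brevity.

```python
# Skeletal METS sketch: a Dublin Core record wrapped in a dmdSec, plus a
# fileSec entry pointing at a hypothetical Broadcast Wave preservation file.
import xml.etree.ElementTree as ET

mets = ET.Element("mets")

# METS defines no descriptive elements of its own; it wraps an external
# standard such as Dublin Core or MARC. The dc: prefix is left unbound here.
dmd = ET.SubElement(mets, "dmdSec", ID="dmd001")
xml_data = ET.SubElement(ET.SubElement(dmd, "mdWrap", MDTYPE="DC"), "xmlData")
ET.SubElement(xml_data, "dc:title").text = "Field recording, reel 12 (demo)"
ET.SubElement(xml_data, "dc:date").text = "1968"

# File section: the digital object itself, here a BWF/WAV preservation master.
file_sec = ET.SubElement(mets, "fileSec")
file_grp = ET.SubElement(file_sec, "fileGrp", USE="preservation")
f = ET.SubElement(file_grp, "file", ID="file001", MIMETYPE="audio/x-wav")
ET.SubElement(f, "FLocat", LOCTYPE="URL", href="reel12_preservation.wav")

print(ET.tostring(mets, encoding="unicode"))
```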
Mirhosseini, Z., & Kazemlou, M. (2015). A comparative study of descriptive metadata in audio archives of IRIB using the PBCore metadata standard. Cumhuriyet Science Journal, 36, 3135-3142. https://doi.org/10.17776/csj.69328
- Mirhosseini & Kazemlou explore PBCore as the metadata standard to be used with the music archives of the Islamic Republic of Iran Broadcasting (IRIB). PBCore elements were built into a survey distributed to eight music archives in Tehran to determine whether their digital archives’ metadata could fit into those elements. The findings show that over 63 percent of the descriptive metadata from the IRIB audio archives could be adapted to PBCore. While the IRIB descriptive metadata does fit the mold of PBCore, more input will be needed from external sources, for example on which extensions to include for international intellectual property data, which would also allow for transparency and shared access with other repositories.
Orio, N., Snidaro, L., Canazza, S., & Foresti, G.L. (2010). Methodologies and tools for audio digital archives. International Journal on Digital Libraries, 10, 201-220. https://doi.org/10.1007/s00799-010-0060-6
- Orio et al. (2010) state that “a complete access to the audio content cannot be carried out without accessing to the contextual information, that is to all the content-independent information available” (p. 201). In this article, Orio et al. cover aural and visual methods for detecting errors in digital audio files and processes for the automated extraction of metadata from these files for better access. Audio fingerprinting and watermarking can reveal errors in the original recording or in playback and can carry contextual metadata, such as title and artist, along with recording-process and digital rights information. Visual capture can document the physical, contextual information surrounding the quality of the original source recordings. The authors highlight the Metadata Encoding and Transmission Standard (METS) and MPEG as appropriate metadata schemas due to their hierarchical structures. The case is convincingly made for methods that capture all metadata surrounding the original audio recordings.
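One piece of the automated extraction the authors describe, harvesting metadata already embedded in the digital files, can be sketched with the third-party mutagen library (my choice for illustration; the article does not name a tool). The directory name is hypothetical, and a production workflow would validate the harvested tags and map them into a schema such as METS.

```python
# Sketch of harvesting embedded title/artist tags and durations from a
# directory of digitized audio files using mutagen (pip install mutagen).
from pathlib import Path
import mutagen

def harvest(directory: str):
    """Yield (filename, selected tags, duration in seconds) for readable audio files."""
    for path in Path(directory).rglob("*"):
        if not path.is_file():
            continue
        audio = mutagen.File(str(path), easy=True)  # returns None for non-audio files
        if audio is None:
            continue
        tags = {k: v for k, v in (audio.tags or {}).items() if k in ("title", "artist")}
        yield path.name, tags, round(audio.info.length, 1)

if __name__ == "__main__":
    for name, tags, seconds in harvest("digitized_audio"):  # hypothetical directory
        print(name, tags, f"{seconds}s")
```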
Otto, J. (2010). A sound strategy for preservation: Adapting audio engineering society technical metadata for use in multimedia repositories. Cataloging & Classification Quarterly, 48(5), 403-422. https://doi.org/10.7282/T31G0NWR
- Otto outlines the Audio Engineering Society’s (AES) draft metadata standard and schema for audio objects, AES-X098B, and how this standard can be applied to digital objects at Rutgers University Libraries to enhance their technical metadata. Because digital formats are still relatively new in the context of all media, knowing which technical information to capture for future access can be challenging. Otto also explains how AES-X098B is a flexible technical standard that fills some of the granularity gaps that the Metadata Encoding and Transmission Standard (METS), PBCore, or PREMIS can miss. Throughout the project, Rutgers worked with professionals in the audiovisual community to establish vocabularies to use within the standard, while also creating an internal application to apply the standard to digital objects. A post-project review of the creation and workflow processes could provide additional insight into the benefits realized after implementation.
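The kind of technical metadata AES-X098B targets, sample rate, bit depth, channel count, and duration, can be pulled from an uncompressed WAV file with nothing but Python's standard library; a minimal sketch follows. The filename is hypothetical, and a real AES-X098B record would express these values in the standard's own XML structure.

```python
# Minimal sketch: read basic technical parameters from an uncompressed WAV file.
import wave

def technical_metadata(path: str) -> dict:
    """Return channel count, sample rate, bit depth, and duration for a WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bit_depth": wav.getsampwidth() * 8,  # bytes per sample -> bits
            "duration_seconds": round(frames / rate, 2),
        }

print(technical_metadata("preservation_master.wav"))  # hypothetical file
```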
Surles, E. (2018). Sound practice: Exploring DACS compliance in archival description of music recordings. International Association of Sound and Audiovisual Archives (IASA), 49, 43-59. http://journal.iasa-web.org/pubs/article/view/64/48
- Describing Archives: A Content Standard (DACS) is frequently used as the descriptive standard in archival collections, but not always in sound archives. While DACS can be useful for its flexible elements in finding aids, the standard does not offer direct guidance for music recordings; as Surles (2018) notes, it “recommends supplementary standards without indicating how to incorporate them in a DACS-compliant finding aid” (p. 43). Surles provides an overview of the challenges faced when describing music collections in archives and evaluates 20 online finding aids to see how DACS has been applied to those collections. The results showed that none of the finding aids were completely DACS compliant. The study suggests that DACS may not be the best overall descriptive standard or metadata schema for music collections in an archive.
Waters, J., & Allen, R. (2010). Music metadata in a new key: Metadata and annotation for music in a digital world. Journal of Library Metadata, 10(4), 238-256. https://doi.org/10.1080/19386389.2010.524863
- Waters & Allen provide an overview of the descriptive, structural, and administrative metadata elements that could be contained in XML for digital music objects. Bibliographic metadata standards, such as the Functional Requirements for Bibliographic Records (FRBR), Dublin Core, and the Metadata Object Description Schema (MODS), are covered in the article, along with multimedia standards, such as various versions of the Moving Picture Experts Group (MPEG) standards and the Metadata Encoding and Transmission Standard (METS). Waters & Allen shed light on the uses of metadata and digital tags for music as of 2010. The authors also provide some insight into future challenges surrounding contextual metadata and interoperability, some of which readers may note have since come to light or been addressed in other research.