Kate Murray

Last updated on 4 November 2020

Authors – all from the Library of Congress: Kate Murray (Digital Projects Coordinator), Trevor Owens (Head, Digital Content Management Section), Marcus Nappier (Digital Collections Specialist), Ted Westervelt (Chief, US/Anglo Division) and Abbie Grotke (Assistant Head, Digital Content Management Section & Web Archiving Program)

The Library of Congress relies on a network of policy documents, good practice and specifications to govern its digital collection management program. These are compiled in the Digital Collections Management Compendium (DCMC), which brings together business guidance from across the Library’s many departments. In October of 2019, the Library of Congress launched a public version of the DCMC, which presents digital collection management policies in three main areas: digital formats, inventory and custody, and access. The section for digital formats includes a summary of how the Library’s Recommended Formats Statement (RFS) can be applied, as well as overarching statements about how we manage various digital preservation issues for digital formats, including the preservation of content “as received” over time, the creation of digital surrogates, and inventorying of format types. As such, the ongoing maintenance and development of the RFS is critical to the Library’s overall digital collection management policy and practice.

The Recommended Formats Statement (RFS) underwent significant changes in 2020. The RFS guides acquisitions librarians from the Library of Congress and beyond in collecting works with an eye toward long-term sustainability and access. The new version incorporates major changes reflecting the way people today create and use content, including geospatial works, three-dimensional designs, musical scores and more. It also introduces a structured method to evaluate digital formats based on global/community sustainability criteria and local/institutional factors, which has allowed us to establish clearer definitions of ‘Preferred’ and ‘Acceptable’ digital file formats for LC.

Given the value provided by the RFS since its first issue in 2014, this past year the Library has undertaken a more intense review and revision process than its usual annual update. For this, the Library called in a broad array of subject matter and technical experts in all types of creative works. These technical teams were given months in which to explore their areas and recent developments in them. At the end of this process, they were able to make foundational improvements that support the ongoing and long-term usefulness of the RFS.

In the past, some of the content categories were determined by practices of digitizing materials, which had generally similar processes in the analog world but are diverging in the digital realm. For example, the digitization of textual works and musical works follows similar imaging workflows, but for born-digital works there are often drastically different format considerations. Likewise, some categories see more crossover in the digital realm. For example, geospatial digital materials often combine characteristics of still images and datasets in the digital representation of maps and other geographical representations. In light of changes like these, the newest version of the RFS reimagines the categories of creative works as used by people today, adding specific ones for Musical Scores; GIS, Geospatial and Non-GIS Cartographic; and Design and 3D.

The new and improved RFS also includes a more structured and transparent evaluation system for identifying the digital file formats prioritized in it. This change provides the RFS with a more sustainable analytical basis, founded on a “Level of Service” model, by defining more clearly why certain formats may be designated “preferred” or “acceptable.” This work is grounded in an understanding of similar models from peer institutions, such as the National Archives and Records Administration, Library and Archives Canada, National Library of Australia, large research libraries, and others. It draws upon already established factors in the community, including the seven “sustainability factors” that the Library already uses for the evaluation of digital formats: disclosure, adoption, transparency, self-documentation, external dependencies, impact of patents, and technical protection mechanisms. Alongside these, local factors, such as staff and systems capacity for working with and understanding particular formats, are also considered. This common template for all content categories increases consistency in the designation of digital formats, providing the clarity and transparency needed both for current users and for the development of future iterations of the RFS.

With these changes, the Library is proud to present the “RFS 2.0” for the use of all, internal and external to the Library of Congress, who have an interest in the preservation and ongoing use of the creative works which form the collection of this and many other cultural institutions.

Another milestone for 2020 is that this year marks the 20th anniversary of web archiving at the Library of Congress. It was in 2000 that the Library of Congress embarked on a web preservation pilot project called MINERVA, which eventually became the Library’s web archiving program. While plans for an in-person event to celebrate the occasion were abandoned due to Covid-19, the anniversary has allowed us to reflect on the first twenty years of the program through a number of blog posts on The Signal.

