File Formats

File formats define how information is encoded in a digital file. A file format can be standardised, open, well documented and possibly associated with a reference implementation showing how software should interact with files of that format. But file formats are not always so clearly defined, and format specifications are not always closely followed by the software that implements them. Understanding file formats, and how we interact with them in practice, can therefore be critical to ensuring effective digital preservation. This page provides guidance on the best sources of further information on file formats. For a broad introduction to file formats and digital preservation, see the DPC Handbook:

See also, the DPC Technology Watch Reports:

Understanding the broader challenges associated with file formats

A number of pieces of work have sought to develop methods of assessing the appropriateness of particular file formats for preservation, typically based on high-level criteria. This includes the now somewhat dated DPC Tech Watch report. More recent thinking has begun to move away from this approach, due to the need to base decisions on practical experience of working with file formats and software:

Precision and completeness are not qualities that can always be associated with file format specifications, and this problem lies at the root of many preservation challenges:

Examples from the Information Security community, while not typical of the preservation challenges we are likely to experience, illustrate the flexibility in many file format specifications:

File format identification

Applying a specialist software tool to identify the formats of files to be preserved is typically one of the first steps in a digital preservation workflow. Read more about File format identification here...
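As a rough illustration of the idea behind format identification, the sketch below matches the opening bytes of a file against a tiny table of well-known signatures ("magic numbers"). The table and the function names `identify_bytes` and `identify_file` are assumptions made for this example. It is not how any particular identification tool works, and real tools draw on much richer signature registries.

```python
# Minimal sketch of signature-based format identification.
# The signature table is illustrative only; it is an assumption of this
# example and covers just three formats.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x00\x00\x00\x0cjP  \r\n\x87\n", "JPEG 2000 (JP2)"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
]


def identify_bytes(data: bytes) -> str:
    """Return a format name if the data starts with a known signature."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first few bytes of a file and try to identify its format."""
    with open(path, "rb") as handle:
        return identify_bytes(handle.read(16))
```

A check like this only goes so far. A PDF/A file, for example, also begins with `%PDF-`, so the header alone cannot distinguish it from any other PDF, which is one reason specialist tools are used in practice.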

Seeking reference information and guidance on specific formats

There are a number of excellent sources of information to assist digital preservationists. Wikipedia remains a good place to start for high-level information about a particular file format. The associated Wikidata is also the focus of the latest effort to build a collaborative registry of file format information.

A small number of libraries and archives have been developing their own preservation-focused assessments of particular file formats. These provide useful guidance on the risks associated with common file formats, and approaches for addressing them. They are located in different places on the web, but are linked from the home of a loose collaboration between these organisations on the DPC Wiki:

The Just Solve wiki provides a community-driven site for gathering information about different file formats and is particularly good for discovering information on more obscure file formats:

Child Tags

PDF, PDF/A, JPEG2000

Articles

File format recommendations - I wouldn’t say they are unacceptable, but I wouldn’t recommend them either

Last week I joined a webinar entitled “A Comparison of Recommended File Formats and the New Dutch Method for File Format Assessment”. It’s great to see the outcomes of this collaborative work, and it’s clear that it has already played an important role in bringing out some key themes in the preservation approaches of various organizations. But I felt that a number of aspects give cause for concern. The collation of file format policies has highlighted some approaches that I believe should be...

Preservation Digitisation Project – Digitising the Tasmanian Archives audio visual collection

Karin Haveman is Acting Manager Government Archives and Preservation at the Tasmanian Archives and Digitisation Services Coordinator. In February 2021, Libraries Tasmania launched the Preservation Digitisation Project – a major collaborative project that brings together Digitisation Services, System Support and Delivery, Government Archives, and the Community Archives teams. The aim of this project is to digitise our Tasmanian film, sound, and video collections for long-term...

Digital preservation at the National Library of Australia

Libor Coufal is Assistant Director for Digital Preservation at the National Library of Australia. We are very mindful that it has been (not quite all, but mostly) quiet on the NLA communication front in the last several years, while we have busily worked on implementing our digital preservation program. Our attendance at this year’s iPres (our first since 2014) was a great opportunity to pause and reflect on the progress we have made. We would like to update the community on what we have...

Digital Preservation Awards 2022 Finalists Announced: Dutch Digital Heritage Network Award for Teaching and Communications

The next set of finalists for the prestigious Digital Preservation Awards 2022 is revealed, with three more creative initiatives being recognised for their achievements! Finalists for the Award for Teaching and Communications, sponsored by Sound & Vision and the National Archives of the Netherlands on behalf of the Dutch Digital Heritage Network, are (in no particular order):

How is DPC RAM being used?

Here are some examples of how DPC RAM has been used by members of the community to help benchmark their progress in digital preservation. If you have an example of DPC RAM in action that you would like to share please contact us: Assessing where we are with digital preservation - a blog post from Fabiana Barticioti, Digital Assets Manager at LSE Library From 'starting digital preservation' to 'business as usual' - a blog post from Anna McNally, Senior...

Digital Preservation Awards 2022

The search for the very best work in digital preservation has begun again this year, with the launch of the Digital Preservation Awards 2022 as part of the Digital Preservation Coalition's 20th Anniversary celebrations! Organized by the Digital Preservation Coalition (DPC) every two years, the prestigious Digital Preservation Awards is the most prominent celebration of achievement for those people and organisations who have made significant contributions towards a sustainable future for our...

Digital Preservation of Community Archives: Breaking down barriers to digital preservation through training

Dr Deborah Thorpe is Education and Outreach Manager for the Digital Repository of Ireland. This autumn, the Digital Repository of Ireland (DRI) held an online introductory training programme in digital preservation for our members, with a particular focus on the training and community-building needs of community archivists. This course has been helping with breaking down barriers to digital preservation, by making topics such as appraising your digital collections for preservation,...

Understanding User Needs: Technology Watch Guidance Note on Access to digital collections available on general release

The DPC has released the next in its series of Technology Watch Guidance Notes on Access to digital collections. The new Guidance Note entitled Understanding User Needs by Sharon McMeekin is available to the digital preservation community from today. Understanding User Needs provides a pragmatic approach to conducting and interpreting a user needs analysis, whilst highlighting the importance and significance of the results.

How to level up with DPC RAM

The following resources are provided to help you to move forward with digital preservation after carrying out a RAM assessment. The resources are organized using the 11 sections of DPC RAM so you can quickly skip to the relevant resources if you want to focus on making progress in a particular area. We are keen that the community learns from the work that others have carried out, so: Please contact us if you have suggestions of other resources to add to...
