File Formats

File formats define how information is encoded in a digital file. File formats can be standardised, open, well documented and possibly associated with a reference implementation for how software should interact with files of that format. But file formats are not always as clearly defined, and format specifications are not always closely followed by the software that implements them. Understanding file formats, and how we interact with them in practice, can therefore be critical to ensuring effective digital preservation. This page provides some guidance on the best sources of further information on file formats. For a broad introduction to file formats and digital preservation, see the DPC Handbook:

See also, the DPC Technology Watch Reports:

Understanding the broader challenges associated with file formats

A number of pieces of work have sought to develop methods of assessing the appropriateness of particular file formats for preservation, typically based on high-level criteria. These include the now somewhat dated DPC Tech Watch report. More recent thinking has begun to move away from this approach, due to the need to base decisions on practical experience of working with file formats and software:

Precision and completeness are not qualities that can always be associated with file format specifications, and this problem lies at the root of many preservation challenges:

Examples from the Information Security community, while not typical of the preservation challenges we are likely to experience, illustrate the flexibility in many file format specifications:
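As one small illustration of this kind of flexibility, the sketch below contrasts a strict check for a PDF header at byte 0 with a lenient check that accepts the header anywhere in the first 1024 bytes, a tolerance that some PDF readers are documented as applying. The function names and the 1024-byte default are assumptions made for this example; this is not how any particular tool is implemented.

```python
PDF_MAGIC = b"%PDF-"


def strict_pdf_check(data: bytes) -> bool:
    """Accept only data that begins with the PDF header at byte 0."""
    return data.startswith(PDF_MAGIC)


def lenient_pdf_check(data: bytes, window: int = 1024) -> bool:
    """Accept data with the PDF header anywhere in the first `window` bytes.

    The window size is an assumption chosen for illustration.
    """
    return PDF_MAGIC in data[:window]


# Hypothetical bytes: a GIF signature, some padding, then a PDF header.
sample = b"GIF89a" + b"\x00" * 10 + b"%PDF-1.4\n"
print(strict_pdf_check(sample), lenient_pdf_check(sample))
```

A file like the hypothetical sample above could be treated as a GIF by one tool and as a PDF by another, which is why the same file can behave differently across software.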

File format identification

Applying a specialist software tool to identify the formats of files to be preserved is typically one of the first steps in a digital preservation workflow. Read more about File format identification here...
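To give a feel for what such a tool does at its simplest, here is a minimal sketch of signature-based identification: it compares the first bytes of a file against a handful of well-known "magic numbers". The table and the function names (`identify_bytes`, `identify_file`) are assumptions made for this example. Real identification tools use much larger signature registries, and also consider offsets, container contents and file extensions.

```python
from typing import Optional

# A tiny, illustrative signature table: (leading bytes, format label).
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\x00\x00\x00\x0cjP  \r\n\x87\n", "JPEG 2000 (JP2)"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify_bytes(header: bytes) -> Optional[str]:
    """Return a format label if the header starts with a known signature."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return None  # unknown: no signature matched


def identify_file(path: str) -> Optional[str]:
    """Read the first 16 bytes of a file and try to identify its format."""
    with open(path, "rb") as handle:
        return identify_bytes(handle.read(16))
```

Note that a ZIP signature alone cannot distinguish a plain archive from formats packaged as ZIP containers, which is one reason real tools look beyond the first few bytes.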

Seeking reference information and guidance on specific formats

There are a number of excellent sources of information to assist digital preservationists. Wikipedia remains a good place to start for high-level information about a particular file format. The associated Wikidata is also the focus of the latest effort to build a collaborative registry of file format information.

A small number of libraries and archives have been developing their own preservation-focused assessments of particular file formats. These provide useful guidance on the risks associated with common file formats, and approaches for addressing them. They are located in different places on the web, but are linked from the home of a loose collaboration between these organisations on the DPC Wiki:

The Just Solve wiki provides a community-driven site for gathering information about different file formats and is particularly good for discovering information on more obscure file formats:
