PRONOM is a file format registry which collects key data about file formats that can be used for the purposes of identification and reference, in support of preservation planning activities for digital records.
The registry, first conceived in 2002, was made publicly available in 2005. As the UK government’s official archive, The National Archives (UK) is responsible for collecting, preserving and making available the public records of the UK government. PRONOM was created when the need was recognised for access to reliable technical information about the nature of the records stored in the digital archive.
In the same way that a conservator of physical objects must understand the materials they are preserving in order to apply the correct care, the digital preservation community needs to understand the characteristics of its digital materials: which file formats are in a collection, and what risks are associated with each, so that preservation risk can be understood and mitigated.
PRONOM is a foundational tool and the file format identification it facilitates is a critical first step towards creating a digital preservation strategy for a collection. PRONOM is designed to meet one of the fundamental challenges of digital preservation: the provision of definitive, trustworthy information about the technical dependencies of digital objects. PRONOM is a centralised, publicly accessible, and continuously growing source of information on file formats, including software, compression, and character encoding.
The first version of PRONOM was developed by The National Archives’ (UK) digital preservation department in March 2002 under the leadership of Adrian Brown. It was one of the first components of our approach to digital preservation: a central technical registry capturing and providing the information necessary to ensure digital records can be read by future generations of researchers.
As PRONOM has evolved, its original broad vision, encompassing software and hardware dependencies, has shifted, and research now focusses on identifying file formats and recording pertinent information about them. PRONOM balances a requirement for reliable, trusted information with continual improvement and internal investment from The National Archives, which is one of the main drivers of its longevity and high regard across the digital preservation community.
PRONOM has an inclusive contribution model and regular submissions are received from across the globe, from institutions such as archives, libraries, academia, and even private industry. The National Archives (UK) has recently recruited two full-time File Format Analysts, demonstrating a commitment of resource to continue to grow the registry. PRONOM currently contains information on over 2000 formats and actively invites further contributions.
Following a workshop and user research, four core values were developed to support PRONOM’s goals. These will shape the future objectives and direction for PRONOM. We wish PRONOM to be community-driven, sustainable, transparent, and accessible.
- Community-Driven: The PRONOM community is a major contributing factor in PRONOM’s success and longevity, and we want this to continue to guide our work and goals.
- Sustainable: We want our processes to be as simplified as possible. We aim to avoid specialist skillsets, single points of failure and locked-in platforms. We want PRONOM to be an open-source initiative that can scale flexibly and keep running costs manageable.
- Transparent: As users place trust in the service, we will be as honest and transparent as possible about our processes and the work we wish to achieve.
- Accessible: We wish to break down the barriers surrounding digital preservation and make it more accessible to anyone not from a technical background. We believe that anyone can conduct file format research and wish to create more awareness around file format risk.
Since September we have created a permanent PRONOM Research Repository on GitHub. We have produced a PRONOM Starter Pack that breaks down the barriers for a non-technical audience to explore file format research, whilst also containing useful information for a more experienced researcher. We now publish our test signatures online weekly for the community to use and test, and we share a publicly accessible spreadsheet that tracks our work between releases. We aim to make our processes more transparent with the publication of our research backlog. We have also produced a blog explaining, in clearer and non-technical language, the importance and use of PRONOM.
We are currently working on a new PRONOM platform with a linked graph data back-end, which we aim to release in the autumn. The new website aims to make submissions easier, be more accessible, and contain new resources and questions that aren’t easy to find on the old website. With the linked graph data approach, we aim to link more naturally with other registries to enhance our entries and create more cohesion between information resources.
The handful of names on this bid is only a small reflection of the people that have worked on PRONOM. They may be the PRONOM team, but they can be better described as current custodians. A twenty-year project from inception to present day should recognise all who have worked on it within The National Archives (UK).
More than that, the PRONOM contributor community includes individuals from over eighty institutions who have contributed their own research to improve the registry’s data. Many individuals and software developers have built tools that use PRONOM data, very often as open-source tooling that anybody is welcome to contribute to and improve. Others from the PRONOM community have written blog posts, conference papers and posters, or delivered training sessions on the use of PRONOM or the file format research that underpins it. It is this wealth of generous community effort that has made PRONOM the important registry it is today, and those who have supported it over the last twenty years should rightly be celebrated.