Tom Ruane

Last updated on 7 June 2021

Tom Ruane is a Preservation Audio Engineer at The British Library. He recently completed the DSM6010 Digital Preservation course at Aberystwyth University with support from the DPC’s Career Development Fund, which is funded by DPC Supporters.

The British Library is home to the UK’s Sound Archive, which holds over 6.5 million at-risk sound recordings on formats ranging from the dawn of sound recording technology all the way through to contemporary digital carriers.  In my role as a Preservation Audio Engineer, I’m part of a dedicated team working to preserve these recordings and ensure ongoing access for the nation.

During the first lockdown in 2020, I applied for the DPC Career Development Fund and was kindly awarded a scholarship to study DSM6010: Digital Preservation, a six-month, postgraduate course at Aberystwyth University.  I completed the course at the end of February 2021 and found its content to be hugely beneficial, giving me a solid grounding in the principles of digital preservation and a much broader, holistic view of data curation.

One point that struck me during my studies was that the development of digital preservation as an organised discipline parallels, in many ways, that of audio preservation, a field in which I have spent the majority of my career.  In retrospect, this isn’t really that surprising: digital preservation is principally format agnostic, and any information whose preservation depends on technology faces the same risks - losing the ability to access the information and interpret it correctly.

So in this post, I thought it might be of interest to DPC members to provide an overview of how we, in the Technical Department of The British Library Sound Archive, currently manage the data we create and prepare it for ingest into the Library’s centrally managed digital repository for long-term preservation; a process that has taken several years to develop and continues to be iterated upon.  

In audio preservation, there are two major risk factors to the information stored on legacy carriers: the degradation of the physical format and obsolescence of the access technology.  Many of the processes we developed over the years were born out of necessity - it is commonly believed within the audio community that we have about 10 years before it becomes impractical, if not impossible, to access content on the majority of audio carriers.  This understandably has always put priority on ‘getting the content off’ the physical format, but over time as our workflows developed and scaled with technology, we were able to generate huge amounts of data in a very short period of time.  So what to do with it all?  Simply having a digitised version doesn’t constitute preservation.  Therefore, the management and curation of the data took on a greater role and began to inform fundamental parts of our workflow. 

Original hardware, emulation environments and new technological processes all play a vital role in rendering and extracting audio information accurately, but migration to a stable, file-based format is considered the only viable way of ensuring ongoing access.  This paradigm also applies, ironically, to earlier ‘preservation’ formats such as PCM-encoded Betamax tape and CD-R, which are now themselves at risk.

Migration begins by capturing the audio information from its source, encoded as a high-resolution PCM WAVE file, with metadata embedded in the file header stating the date/time, engineer and encoding software.  After capture, checksums are generated for the files and verified throughout the workflow.  The files are then added to an Information Package, linked to the Library’s catalogue for access and discovery.  File verification and format profiling are undertaken via a suite of tools, including DROID, JHOVE, and MediaInfo; assuming the uploaded file meets the ‘allowlist’ conditions, the necessary output is inserted into the associated METS document.
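For readers less familiar with fixity checking, the checksum step can be illustrated with a minimal Python sketch.  This is not the Library’s actual tooling: the function names and the choice of SHA-256 are assumptions made purely for illustration, since the algorithm we use is not stated here.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in chunks so that a
    large WAVE file never has to be loaded into memory in one go."""
    digest = hashlib.new(algorithm)  # SHA-256 is an assumed default
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path, expected, algorithm="sha256"):
    """Return True if the file still matches the checksum recorded at capture."""
    return file_checksum(path, algorithm) == expected.lower()
```

The value returned by `file_checksum` would be recorded once, straight after capture, and `verify_checksum` called at each later stage; any mismatch indicates that the file has changed since it was created.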

From here, each file’s relationship to the physical source can be described.  For example:

  • File 1, derived from, Tape 1, Side 1
  • File 2, derived from, Tape 1, Side 2

Associated files, such as label scans, transcripts etc. can also be linked.  However, the focus of our preservation work is the audio, so this is not currently a routine part of the workflow.
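The ‘derived from’ relationships described above can be pictured as a simple data structure.  The Python sketch below is only an illustration: the class, field and file names are hypothetical, and in practice this information is recorded in the METS document rather than in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivation:
    """One 'derived from' link between a file and its physical source."""
    file_name: str  # hypothetical file name
    carrier: str    # e.g. "Tape 1"
    side: int       # e.g. 1


# The two relationships from the example above.
derivations = [
    Derivation("file_1.wav", "Tape 1", 1),
    Derivation("file_2.wav", "Tape 1", 2),
]


def files_for_carrier(links, carrier):
    """List the files derived from a given carrier, ordered by side."""
    matching = [link for link in links if link.carrier == carrier]
    return [link.file_name for link in sorted(matching, key=lambda l: l.side)]
```

Calling `files_for_carrier(derivations, "Tape 1")` returns both file names in side order, which is the kind of question the physical structure allows us to answer.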

Once the physical structure is established, the complete Process History of the object is described using controlled vocabulary; this includes the transfer equipment, replay parameters, hardware/software processing, and even the individual inputs-outputs, linking each piece of equipment in the signal chain.
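To give a flavour of what linking each piece of equipment in the signal chain involves, here is a minimal Python sketch.  The vocabulary terms, class names and port labels are all hypothetical; the Library’s actual controlled vocabulary and schema are not reproduced here.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabulary of equipment types.
EQUIPMENT_TYPES = {"tape machine", "A/D converter", "workstation"}


@dataclass(frozen=True)
class Device:
    name: str
    equipment_type: str

    def __post_init__(self):
        # Reject any term that is not in the controlled vocabulary.
        if self.equipment_type not in EQUIPMENT_TYPES:
            raise ValueError(f"unknown equipment type: {self.equipment_type!r}")


@dataclass(frozen=True)
class Connection:
    """Links a named output on one device to a named input on the next."""
    source: Device
    output: str
    target: Device
    input: str


def describe_chain(connections):
    """Render each link in the signal chain as a readable line."""
    return [
        f"{c.source.name} [{c.output}] -> {c.target.name} [{c.input}]"
        for c in connections
    ]
```

Restricting `equipment_type` to a fixed set of terms is what keeps descriptions consistent between engineers, and recording each output-to-input link allows the full path of the signal to be reconstructed later.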

The Information Package is passed to our Cataloguing team, who describe the audio content of each recording and define its Logical Structure, linking each audio section to the associated catalogue record, with timecode.  There is also provision to define varying access controls, determined by things such as rights ownership, sensitivity or location.  The Information Package is then submitted to the Library’s central repository for ingest, at which point it is transformed into an Archive Information Package (AIP) for long-term preservation. 

Access requests are completed according to the parameters of the Information Package, and served to the user through the Library’s IIIF-compliant Universal Player - an extension of existing ‘Viewers’ that enables the rendering of time-based media content.

The current workflow has proven to be very robust - enabling the Library to actively preserve hundreds of thousands of at-risk recordings.  So now that we’re able to comfortably preserve the data we produce, what next?  Through my studies I now have a better understanding that ‘preservation’ doesn’t end with a file in a repository.  Lifecycle data management and curation is an ongoing process: migration pathways, information authentication, interpretation and reuse are vital issues that will continue to be assessed in order to ensure the information we hold is accessible and understandable.  The British Library has the expertise within the organisation to address these issues; however, I now feel that I can contribute to the conversation and bring my own knowledge and insight to the table.
