Libor Coufal is Assistant Director for Digital Preservation at the National Library of Australia

We are very mindful that it has been (not quite all, but mostly) quiet on the NLA communication front in the last several years, while we have busily worked on implementing our digital preservation program. Our attendance at this year’s iPres (our first since 2014) was a great opportunity to pause and reflect on the progress we have made. We would like to update the community on what we have been up to and the things we have achieved.

Where we started

The oldest known born-digital items in the NLA collection are from the beginning of the 1980s, but digital collecting really took off with the launch of the Library’s trailblazing web archiving program in the second half of the 1990s. This was soon followed by mass digitisation of our physical collections, including newspapers and an extensive collection of oral history and folklore (OH&F) audio recordings dating back to the 1950s.

The Library recognised early on the need to preserve its digital collection and built a first generation of digital library infrastructure, including in-house mass storage and bespoke systems and tools for managing digital collections.

In 2011, we embarked on a major award-winning, six-year program to replace the aging digital library infrastructure and improve our digital collecting, preservation and delivery capabilities. This included procuring Preservica as an addition to our digital preservation ecosystem to safeguard our born-digital material. The Digital Library Infrastructure Replacement program coincided with the introduction of new digital legal-deposit legislation. This eventually led to the development of a new National eDeposit (NED) system, a nation-wide collaboration with Australian State and Territory Libraries.

To date, the digital collection has grown to almost 3 PB (single copy), which represents a whopping 15+ billion files (or 154 million files if the web archive is counted only at the WARC-container level) and comprises digital publications (including web content, music scores and maps), oral history and folklore recordings, digital photographs and personal archives, as well as digitised physical collections.

How far we have got

At NLA, we see digital preservation as a cross-library function which is part of a holistic approach to collection care and preservation, regardless of the format. We now have an established digital preservation program in place which is governed by a three-year Digital Preservation Strategic Plan (currently on its second iteration) and overseen by a group of directors representing the main Library stakeholders.

The majority of our digital collection is managed:

  • it is bit-level preserved as per our Bit-level preservation policy, i.e. stored in managed storage, replicated, with regular fixity checks;

  • we regularly ingest new digital-collection material into our digital preservation ecosystem;

  • and we have implemented digital-preservation processes to proactively manage the collection, namely verification of the file-format composition of the collection and regular reporting on the level of support for the file formats it contains (also known as technology watch), based on empirical evidence.
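
As an illustration only, a regular fixity check of the kind mentioned above can be sketched in a few lines of Python. This is a minimal sketch, not the Library's actual tooling: the choice of SHA-256, the manifest layout (file path mapped to expected checksum) and the function names `sha256_of` and `check_fixity` are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest: dict[str, str]) -> dict[str, str]:
    """Compare each file against its recorded checksum.

    `manifest` is a hypothetical mapping of file path to expected SHA-256 digest.
    Returns a mapping of path to "missing" or "mismatch" for every file that
    fails the check; an empty result means all files passed.
    """
    failures: dict[str, str] = {}
    for name, expected in manifest.items():
        path = Path(name)
        if not path.is_file():
            failures[name] = "missing"
        elif sha256_of(path) != expected:
            failures[name] = "mismatch"
    return failures
```

In a setup like this, the manifest would be recorded at ingest and the check re-run on a schedule against each replica, with any reported failure repaired from a copy that still matches.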

We are starting to look at using the output of our level-of-support monitoring process to inform prioritisation, planning and resourcing for material which needs a preservation action. This will include creating preservation plans for file formats which are currently not, or not sufficiently, supported.

Only a relatively small but important part of the digital collection is currently unmanaged, meaning the material has not yet been processed from the original carrier (e.g., a floppy disk) into the preservation system and managed storage. We are looking at prioritising and getting on top of this backlog as an urgent matter.

Most of our digital collection is publicly discoverable and available online through Trove, on- or off-site, depending on copyright or access conditions. This includes the complete Australian Web Archive, the OH&F collection, as well as all digitised and National eDeposit (NED) material. For the small part of the collection which is currently not available through Trove, we are investigating other interim solutions such as EaaSI.

That means we are very close to having an end-to-end digital preservation and permanent access workflow. This progress would not have been possible without the foresight, the vision and the groundwork of our predecessors, or without the help and support of our many current colleagues across the Library.

Of course, the work is far from over. Digital preservation is a never-ending task, at least for the foreseeable future. With the modest resources we have, we must balance adding new workflows and processes, as the digital collection grows and we deal with new content types and challenges, against maintaining the ones already in place. That requires pragmatism, strict prioritisation, risk management and a focus on efficiency, scalability and continuous improvement.

