Garth Stewart

Last updated on 10 January 2020

Garth Stewart is Head of the Digital Records Unit at National Records of Scotland

Anyone who has ever moved home can probably agree that it is at once a very exciting, yet stressful experience. Fitting your personal belongings into cardboard boxes can be a real mission; delivery vans can sometimes turn up at the wrong address, or not at all; and once you do manage to transport everything across town and country to your new gaff and unpack everything, inevitably something goes missing in transit. In short, moving big collections of stuff significantly increases the risk of loss.

The same principle applies when moving collections of digital materials. However, as covered in the DPC Handbook, collection migrations are an essential fact of life, given that every IT storage system will eventually become obsolete. When this point is reached – or ideally before it – it’s up to digital preservation professionals to ensure that any valuable materials on that system are migrated safely to a more modern storage environment, and that the success of this migration – or otherwise – is documented appropriately.

Context for our collection migration

Two years ago we blogged about National Records of Scotland’s Interim Repository Solution, which is managed by our Digital Records Unit (DRU). As we shape our plans for digital archiving at NRS, the Interim Solution continues to provide us with the baseline digital storage capacity to accept and preserve a copy of the digital archival accessions transferred by our depositors.

In 2018, the NRS IT Services team initiated the Common Operating Platform (COP) programme to address a number of legacy technology challenges faced by NRS. The primary outcome was to transition the services to a single, consolidated, future-ready set of IT infrastructure components. The resultant platform was badged ‘NRScotland’.

This project included the closure of the Storage Area Network (SAN) that the Interim Solution was using to store data – a platform referred to as ‘FER’. This meant we would need to migrate our archival collection from FER to its replacement, NRScotland. With this in mind, we entered into conversations with our IT colleagues in Spring 2019 to work out how this migration process would work.

Our requirement

Our biggest priority was to ensure that the completeness (files and folders) and fixity (bits and bytes) of our collection remained the same, both before and after transfer to NRScotland. As a matter of course, all objects in our digital archive are provided with hash values (A.K.A. checksums) once they enter our care – we use DROID to create hash values where a depositor has not provided them. Our requirement was that all objects being migrated must have their checksums checked before and after migration to NRScotland, to ensure they all made it across in good order. We wanted to confirm a simple, transparent chain of custody for every digital object. The challenge was: how could we check these hash values at scale to provide us with this assurance?

A few processes were ruled out. Manual sampling of migrated objects was rejected, as this would not provide the cross-collection integrity check that we needed. Similarly, the use of Commvault, a market-leading data management tool, was proposed to provide the documented assurance trail. However, this too was unsuitable, as the way our storage infrastructure was configured prevented this tool from integrating our pre-existing hash values into its processing workflow. Commvault was used to securely transfer the data, but a different solution was needed to provide the level of documented assurance and evidence.

In the end, our solution was a simple PowerShell script crafted by our IT colleagues, which generated a series of manifests of the collection ‘before’ and ‘after’ its migration to NRScotland. These manifests included the crucial components for fixity testing:

  • Full name of every file in the digital archive
  • File size per file
  • MD5 checksum per file
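
To give a flavour of what such a manifest involves, here is a minimal sketch in Python. It is not the PowerShell script our IT colleagues wrote; the function names, the CSV column names (`path`, `size_bytes`, `md5`) and the choice to record paths relative to the collection root are all assumptions made for illustration. Relative paths are used so that a ‘before’ and an ‘after’ manifest can be compared even though the two platforms mount the collection in different places.

```python
import csv
import hashlib
import os


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, reading it in chunks
    so that large objects never have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Walk every folder under `root` and return sorted rows of
    (relative_path, size_in_bytes, md5) for each file found."""
    rows = []
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(folder, name)
            # Hypothetical convention: store paths relative to the root,
            # with forward slashes, so manifests from two systems line up.
            relative = os.path.relpath(full_path, root).replace(os.sep, "/")
            rows.append(
                (relative, os.path.getsize(full_path), md5_of_file(full_path))
            )
    return sorted(rows)


def write_manifest(rows, csv_path):
    """Write manifest rows to a CSV file with a header line."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["path", "size_bytes", "md5"])  # assumed column names
        writer.writerows(rows)
```

Running `build_manifest` once against the old platform and once against the new one would give two lists that can be checked against each other line by line.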

A test of this scripting process took place in August 2019 – we pasted the manifests generated by the script into Microsoft Excel for checking using a simple Excel formula. Our only inconvenience with this method was that Excel imposes a limit of 1,048,576 rows per sheet, meaning our results were split across two sheets to cover all 1.6m objects in the collection. Considering how many things could have gone wrong with the migration process, this was a very acceptable outcome!

Figure 1: layout of Microsoft Excel spreadsheet containing manifest data for pre- and post-migration
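
The same before-and-after comparison could also be done in a short script, which sidesteps the spreadsheet row limit. The Python sketch below is an illustration only, not the Excel formula we used: it assumes each manifest is a CSV file with hypothetical columns `path`, `size_bytes` and `md5`, and the function names are invented for this example.

```python
import csv


def load_manifest(csv_path):
    """Read a manifest CSV (assumed columns: path, size_bytes, md5)
    and return a list of (path, size, md5) tuples."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return [
            (row["path"], int(row["size_bytes"]), row["md5"].lower())
            for row in csv.DictReader(handle)
        ]


def compare_manifests(before, after):
    """Compare two manifests and report every discrepancy.

    missing:    paths in the 'before' manifest but not in 'after'
    unexpected: paths in the 'after' manifest but not in 'before'
    changed:    paths in both whose size or checksum differs
    """
    before_index = {path: (size, md5) for path, size, md5 in before}
    after_index = {path: (size, md5) for path, size, md5 in after}

    missing = sorted(set(before_index) - set(after_index))
    unexpected = sorted(set(after_index) - set(before_index))
    changed = sorted(
        path
        for path in before_index.keys() & after_index.keys()
        if before_index[path] != after_index[path]
    )
    return {"missing": missing, "unexpected": unexpected, "changed": changed}
```

Three empty lists would indicate that completeness (no missing or unexpected files) and fixity (no changed sizes or checksums) both held across the migration.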

Setting our parameters

We now had a working option that met our requirement. Next, we needed to determine whether we could migrate our entire digital archive as part of this process.

By August 2019, we had a number of outstanding digital accessions which had not yet been copied to FER – mostly recent transfers by depositors. The question was: should we copy these to FER prior to migration, or should we wait and copy these directly to NRScotland post-migration?

Copying to FER presented challenges – this platform could only be accessed from a separate building outside of normal office hours, for example. However, we were keen to copy as much into FER pre-migration as possible – having as much of our collection in one place would make the migration process more robust and support its future management. Due to the diligence and flexibility of DRU staff we managed to copy the vast majority of outstanding accessions over to FER prior to migration.

A Better Interim Solution

In October 2019 the copy of the digital archive collection held on FER was successfully migrated to NRScotland. Whilst this process was undertaken, the copy on FER was retained until completeness and fixity checking were successfully completed. Results of these checks were stored in our corporate ERDMS as an essential record of the collection’s chain of custody. As a significant bonus, we also had a new terminal installed in our office in West Register House, which provides us with local access into the collection on NRScotland. This has made routine checks for preservation and access infinitely more efficient. We also have much greater potential for expanding storage capacity for the collection on this platform.

Figure 2: new workstation to NRScotland (on the right) in West Register House. On the left is our standalone quarantine machine.

Just as importantly, however, this successful collaboration has created stronger working relations between the DRU and our IT colleagues, which will be of immense benefit to our future, exciting work on digital archiving.
