Rachel MacGregor

Last updated on 2 February 2021

Rachel MacGregor is the Digital Archivist at Lancaster University


We are members of the Digital Preservation Coalition, a membership organisation which exists to secure our digital legacy. Members include businesses, HE institutions, funding bodies, national heritage and cultural organisations and are drawn from every continent.

Last week all members were invited to the annual un-conference where we come together not only to share experiences and network but also to help set the Digital Preservation Coalition’s training and development agenda for the year ahead. The idea is that members have the opportunity to raise the issues which really matter to them and then discuss how the DPC can take action to move forward on these issues.

The agenda is set on the day and full members are invited to give a three-minute presentation of their successes, challenges and future opportunities. Listening to the reports it was clear that there were themes common to all, whatever stage of maturity they are at.

So what were the common themes which came out of the day?

Challenges

Many people shared their efforts to meet the challenge of preserving specific “types” of data:

  • Software and software environments
  • Email
  • Audio visual materials
  • Sensitive data

The preservation of Research Data, which usually means a huge range of data types, also came up. Here at Lancaster the preservation of Research Data has been our focus so we are well aware of the challenges we face, but it’s great to be able to share them and know that there is a community out there working on this together. We have also been engaging with software preservation and looking into ways in which we can support our researchers who create software. There is really encouraging work being done by the Software Sustainability Institute and here at Lancaster we have been running various initiatives including inviting Neil Chue Hong of the SSI to speak at our Data Conversation and presenting at our own Research Software Forum.

There were quite a few organisational challenges discussed such as:

  • Huge rise in quantity of data and difficulties in predicting the growth rate.
  • Resources either staying the same or being cut in the face of the growth in data
  • Sustaining work beyond a project level - moving it on to business as usual
  • Dealing with organisational restructures

Finding the right tools for the job

These challenges require robust strategies and planning to tackle. Again, the approaches we need can be developed as a community. Here at Lancaster we are developing a tool called DMAonline as part of the Jisc Research Data Shared Service. DMAonline has reporting functionality for a variety of research data and scholarly communications outputs but one of the things we are hoping for is that it will be able to provide intelligence (rather than analytics) – it will use machine learning to make suggestions on growth and development and predictions on future use.

We don’t just want to create pretty graphs; we want to answer questions, for example predicting growth in storage needs or predicting the growth of the “long tail” of unidentified file formats. It’s an ambitious aim but we are keen to take part in the challenges presented by the long-term preservation of digital assets.

Finding the right tools for the job was also mentioned.  I think we would all agree that the tools we currently have are not necessarily the right fit for the job. Often we just need to get on with the job and have to use the tools which are available but sometimes it’s good to take a step back and say - what are we trying to achieve? What is the best way to get there and what should the tools we need look like? I don’t have the technical knowledge to build them but I can work with others - like my team here at Lancaster - to work towards this.

The human problem

One thing that came up was the challenge of getting the data/records/archives as quickly as possible, i.e. before they are lost/altered/deleted/degraded/ended up on a corrupted CD. Some of this challenge is technical, i.e. having simple easy-to-use systems which people will engage with and which will encourage good data practices. However more of the challenge is about getting people to engage with the process in the first place so that vital data, metadata and contextual information is not lost over the passage of time.

Successes

It was great to hear about many successes, with many institutions implementing a fully functional preservation system. Other institutions had successes getting digital preservation on the agenda with senior management. One institution mentioned that they argued that by not investing in digital preservation and training they would fall behind competitors. Another mentioned getting digital preservation recognised on a risk register. These are all significant achievements and show that individual institutions are moving forward and making progress.

It was also really good to hear about some specific projects such as the work done by the National Library of Scotland on converting TIFF files to JP2 or the British Library’s work to keep up with the challenge of preserving digital formats which form part of the collections of a legal deposit library. This work will also benefit other institutions tackling similar problems.

Moving on

I really hope this day leads to relevant and targeted planning and support for all DPC members and I also hope it helps connect us as a community to tackle the common challenges which we all face.  The Digital Preservation Coalition also provide lots of resources for the wider non-member community so it’s a great way of coordinating development work and sharing expertise to help foster a real community of practice.

