Gareth Cole

Last updated on 18 March 2021

Gareth Cole is Research Data Manager at Loughborough University. 


I feel I can say pretty confidently that most researchers don’t know what digital preservation is. But things may be changing.

As with many universities worldwide, at Loughborough we are working with our academic community to increasingly make their research data publicly available. This is not without its challenges. Some researchers deposit data because they want to make it openly available, some deposit because they are under a funder or publisher obligation, and some don’t deposit! The range of motivations means that a one-size-fits-all approach isn’t appropriate. However, we have to balance this with our limited staffing resource, as there is only so much we can do.

As part of our research data management (RDM) training sessions we rarely talk about digital preservation – certainly not as a phrase or concept. However, we try to encourage our researchers to think about some of the broader practicalities of RDM:

  • Can my collaborators use this data?

  • Could I use this data in five years’ time?

  • Is my research reproducible?

  • Can I share my data? If so, where and how?

  • What are the benefits of sharing my data?

The first two points are important even if the researcher doesn’t plan to share the data via a repository or archive. We tell our researchers that they will always be the first re-users of their data. I’m not in an academic position and finished my PhD back in 2008. However, I am still using the data and material I collected over ten years ago in my current research. Most PhD students who go on to post-docs and lectureships will also continue to use material they collected as part of their PhD studies in future work. They may not even realise they are doing it, but in order to keep this data useable researchers will migrate the data from one format to another. Even those of us who are reliant on Microsoft products have probably moved .xls files to .xlsx or .mdb to .accdb. These have been manual changes (migrations?) rather than automated ones, but they have been needed to keep the files useable for our own purposes. Using examples like this can help to change the thought process from “It’s digital so it will be around in 10 years” to “I need to think about how to keep this data useable”. Other RDM principles such as documentation (“Include a Readme file”) and metadata also feed into a similar narrative.

The reproducibility agenda will also contribute to a change in the research culture. Increasingly, researchers are being asked to make their research reproducible. However, one thing that isn’t really mentioned (as far as I am aware – I would be delighted to be wrong here) is reproducible for how long? I’ll leave this hanging, but can we claim that research is “reproducible” if it’s only reproducible for two years, or five years, or ten years? Is reproducibility another tool where we can bring digital preservation to researchers without actually mentioning preservation?! I’ll be exploring some of these points as part of a Loughborough ReproducibiliTEA session in June.

Does it matter that researchers don’t know what digital preservation is? Probably. Does it matter that researchers don’t (as a general group) think about what file format their data is in? Probably. Does this mean that we need to train researchers in digital preservation? Probably. Does this mean we have to call it digital preservation in training sessions? Probably not!

Comments   

#1 Jaana Pinnick 2021-03-19 13:34
I use other expressions such as reusability and digital continuity when talking to researchers. As to how long research needs to be reproducible, I suggest it depends on the type of data. Some types remain valid longer; geoscience data is an example of data with very long validity extending the requirement for the data to remain reusable. I suggested in 2017 (https://doi.org/10.1108/RMJ-04-2017-0009) that "Geoscience is an interpretive discipline working on previous data and hypotheses which must remain available for future research and interpretation." Fossils are a very physical type of palaeontological data, but digital data on their interpretation becomes an object of our preservation work. Do scientists need to know about preservation, as well as geological eras? Most definitely yes. An important part of our role is to raise awareness and educate data creators on the impact of their RDM practices on the longevity, reusability, and indeed reproducibility of digital research data.