Research Data Published through Repositories

Vulnerable

Research data published through digital repositories or other service providers that have specialist skills to manage the data and an ongoing commitment to ensure its preservation.

Digital Species: Research Outputs

Trend in 2022:

Reduced risk (material improvement)

Consensus Decision

Added to List: 2019

Trend in 2023:

Reduced risk (material improvement)

Previously: Vulnerable

Imminence of Action

Action is recommended within three years, detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on many people and sectors.

Effort to Preserve | Inevitability

It would require a small effort to preserve materials in this group, requiring the application of proven tools and techniques.

Examples

Recognized data repositories in specialist disciplines; institutional data repositories in subject specialist centres and partnerships.

‘Endangered’ in the Presence of Aggravating Conditions

Lack of long-term commitment; lack of user community; lack of visibility to potential depositors; lack of institutional commitment; insufficient documentation.

‘Lower Risk’ in the Presence of Good Practice

Certification and documented good practice; effective documentation requirements for depositors; proven financial sustainability; skilled staff, including professionalised disciplinary and general data stewardship that offers a clear career option; participation in the digital preservation community; research data management training offered by repositories and research funders to depositors, in particular new career researchers.

2023 Review

This entry was added in 2019 as a separate entry, though it was first introduced in 2017 under ‘Published research outputs’ without explicit reference to the capacity of the repository infrastructure. In 2019, the Jury split that entry into a range of contexts for research outputs, including this one. It was classified as Vulnerable: publishing research data through a well-founded repository that has the capacity and commitment to ensure preservation, and that builds capability through its own professional development activities, makes for a 'lower risk' outcome for research data.

The 2021 Jury agreed with this classification but commented on the improvements and initiatives towards the preservation of research data and outputs, leading to a trend towards reduced risk.

The 2022 Taskforce agreed on a trend towards reduced risk based on material improvements over the last year that have not only offered examples of good research data management and preservation practice but also suggest a significant shift towards a culture of change and collaboration across different research communities and stakeholders. These include (but are not limited to) improvements and initiatives by the European Open Science Cloud (EOSC), Science Europe, Research Data Alliance (RDA), Digital Curation Centre (DCC) and related projects on the preservation of research data and outputs.

The 2023 Council agreed with the Vulnerable classification and noted a trend towards reduced risk due to increasing research data management and engagement activity by libraries, which should result in more datasets being deposited. The 2023 Council also noted that it would be useful to see empirical data on deposit trends to assess this.

Additional Comments

A key consideration with this entry is whether the data repository is integrated with a preservation system to facilitate long-term access to and usability of datasets.

The loss of tools, data or services within this group would impact on people and sectors around the world, particularly those involved with reproducibility and those wishing to use the datasets for further research.

Although there have been improvements in current practice, policies and workflows, there is still a significant corpus of information that was deposited before these improvements came into force. It is unlikely that there will be the time, will or resources to bring this information up to current standards.

Adding preservation metadata to research data holdings may help render data more robust in the long term where using a preservation system is not an option. With an emphasis on environmental sustainability, some repositories hesitate to mandate additional copies of large datasets, which may be in the region of hundreds of terabytes, as this adds to both storage cost and carbon footprint, especially when capturing and preserving the research methodology would enable the dataset to be recreated.
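As an illustration of what additional preservation metadata might look like in practice, the sketch below records basic fixity information (relative path, size and SHA-256 checksum) for every file in a dataset directory and later checks the files against that record. It is a minimal sketch under stated assumptions, not a description of any particular repository's workflow: the function names and the manifest layout are hypothetical, and a real service would more likely follow an established packaging or metadata standard.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(dataset_dir: Path) -> dict:
    """Collect fixity metadata for every file under dataset_dir.

    The manifest layout is a hypothetical example, not a standard.
    """
    files = []
    for path in sorted(p for p in dataset_dir.rglob("*") if p.is_file()):
        files.append({
            "path": path.relative_to(dataset_dir).as_posix(),
            "size_bytes": path.stat().st_size,
            "sha256": sha256_of(path),
        })
    return {
        "generated": datetime.now(timezone.utc).isoformat(),
        "algorithm": "sha256",
        "files": files,
    }


def verify_manifest(dataset_dir: Path, manifest: dict) -> list:
    """Return relative paths that are missing or whose checksum has changed."""
    failures = []
    for entry in manifest["files"]:
        target = dataset_dir / entry["path"]
        if not target.is_file() or sha256_of(target) != entry["sha256"]:
            failures.append(entry["path"])
    return failures
```

A manifest of this kind could be serialised (for example with `json.dumps`) and stored alongside the dataset, so that later checks can detect silent corruption without requiring a second full copy of the data.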
