Endangered

Closed research data sets produced and documented in accordance with good practice and simply appended to a journal article or transferred to a repository which does not have sufficient subject-matter expertise or funding commitment to ensure reliable or ongoing preservation for the long term.

Group: Research Outputs

Trend in 2021: Trend towards reduced risk

Consensus Decision

Added to List: 2019

Previous classification: Endangered

Trend in 2022: Material improvement


Imminence of Action

Action is recommended within three years, detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve

It would require a small effort to prevent losses in this group, such as the deployment of proven preservation tools or techniques.


Data sets added to papers in repositories that are designed primarily for papers; electronic journals offering data sets without obvious preservation capacity; institutional repositories servicing highly complex scientific data sets with insufficient subject-matter expertise.

‘Critically Endangered’ in the Presence of Aggravating Conditions

Unstable funding or revenues; poorly designed migration or normalization processes; poorly formed ingest and quality assurance procedures; rapid churn of staff; incoherent patterns of subject matter; lack of domain knowledge; no or very small numbers of users; weak or absent collecting policy; deposit to ensure minimal compliance with funder mandate; limited or dysfunctional data management planning.

‘Vulnerable’ in the Presence of Good Practice

Clear preservation planning; repository development roadmap; ability to transfer collections or share metadata with subject repositories or portals; strong user base; demonstrable re-use of data; clear collecting policy; data management planning early in data lifecycle.

2021 Jury Review

This 2019 entry was first introduced in 2017 under ‘Research Data’, though without explicit reference to the research data appended to journal articles. In 2019, the Jury split the ‘Research Data’ entry into a range of contexts for research outputs, including this addition. The entry draws attention to services which take upon themselves commitments to preserve research data, but which may not be able to deliver on those promises through lack of capability. The 2021 Jury agreed with the Endangered classification but commented on the improvements and initiatives towards preservation of research data outputs, with good practice documentation and replication in this space (e.g. collaborations with publishers and repositories, LOCKSS, CLOCKSS, etc.). For these reasons the 2021 trend is towards reduced risk.

2022 Trend

The 2022 Taskforce agreed on a trend towards reduced risk, based on material improvements over the last year that have not only offered examples of good research data management and preservation practices but also suggest a significant shift towards a culture of change and collaboration across different research communities and stakeholders. These include (but are not limited to) improvements and initiatives by the European Open Science Cloud (EOSC), Science Europe, the Research Data Alliance (RDA), the Digital Curation Centre (DCC) and related projects on the preservation of research data and outputs.

Additional Comments

Research data is complex and has specific requirements for documentation which may only be known to subject matter experts. However well intended, it is risky for institutions to attempt to replicate that level of expertise across all the domains within the institution, and it can be hard for smaller publishers to make commitments to sustain data in the long term.

Case studies or examples:

  • The FAIRsharing Collaboration with DataCite and Publishers. See: McQuilton, P., Sansone, S.A., Cousijn, H., Cannon, M., Chan, W.M., Carnevale, I., Cranston, I., Edmunds, S., Everitt, N. and Ganley, E., (2019) FAIRsharing Collaboration with DataCite and Publishers: Data Repository Selection, Criteria That Matter, online at https://osf.io/m2bce/.

  • Resources and research outputs from the Enhancing Services to Preserve New Forms of Scholarship project, which examined a variety of enhanced eBooks and identified which features can be preserved at scale using tools currently available, online at https://archive.nyu.edu/handle/2451/63332. Of particular note are the recently published guidelines for preserving new forms of scholarship. See: Greenberg, J., Hanson, K., & Verhoff, D. (2021). Guidelines for Preserving New Forms of Scholarship. NYU Libraries. https://doi.org/10.33682/221c-b2xj.

  • The work by the Centre pour la Communication Scientifique Directe (CCSD) of France and the Confederation of Open Access Repositories (COAR) in creating a preprint repository directory, which has been relevant to building a user community, online at https://doapr.coar-repositories.org/
