On the 25th of April this year (2018) the European Commission released its Recommendation on access to and preservation of scientific information, Brussels, 25.4.2018, C(2018) 2375 final (author resists temptation of side-rant on ISO standard date formats). This work by Mariya Gabriel and Carlos Moedas replaces that by Neelie Kroes (then Commission Vice-President) back in 2012. This revision "recognises that big data and high-performance computing are changing the way research is performed and knowledge is shared, as part of a transition towards a more efficient and responsive open science". This is particularly timely as the UK Data Service is working with international colleagues on the SERISS project to consider the legal, ethical, quality and archival practice implications of 'new and novel' forms of big data.

The paper is not long and it's pretty comprehensible, so you may wish to skip these whimsical musings and go straight to the real thing.

[Squared brackets] below refer to the 12 recommendations, while the document itself is divided into nine unnumbered sections (author resists side-rant on machine actionable textual-markup), italicised below. The opening 15 recitals are identified with (parentheses).

Mechanisms and Reporting

[12.] Envisions member states reporting their actions to the Commission in eighteen months, then every two years thereafter. But so does the 2012 version and I must admit the previous review cycle has passed me by somehow, though there was a report in 2015 with an overview by member states. 

Structured coordination of Member States at Union level and follow-up to this Recommendation identifies a need for each member state to coordinate the recommended measures and provide a contact point to the Commission towards "better definitions of common principles and standards, implementation measures and new ways of disseminating and sharing research results in the European Research Area" [11.]. These contact points will ensure multi-stakeholder dialogue on open science at national, European and international level with an explicit expectation of systemic, gradual changes in research culture across the relevant actors covering "all research outputs from all phases of the research life cycle (data, publications, software, methods, protocols, etc.)" [10.].

Recitals 

The opening 15 recitals provide context, particularly around the Digital Single Market Strategy and the European Cloud Initiative but the critical statement for a DPC audience may be (7): 

"Preservation of scientific research results is in the public interest. [] Mechanisms, infrastructures and software solutions should be in place to enable longterm preservation of research results in digital form. Sustainable funding for preservation is crucial as curation costs for digitised content are still relatively high. Given the importance of preservation for the future use of research results, the establishment or reinforcement of policies in this area should be recommended to Member States."

(7) also notes that this has *traditionally* been the remit of archives and libraries, though I hope the vision of the European Open Science Cloud (EOSC) as a "trusted, open environment for the scientific community for storing, sharing and re-using scientific data and results" is seen as a common infrastructure component rather than a replacement for the 'traditional' centres of preservation practice... The quoted statement from the European Cloud Initiative that it will "start by federating existing scientific data infrastructures, today scattered across disciplines and Member States" (8), rather implies a distributed archipelago of disconnected practice, rather than the collegial networks of experts many of us feel we already benefit from. 

The remainder of the recitals plays to the requirement that "all accessible data held by a public sector body needs to also be reusable for commercial and non-commercial purposes by all interested parties under non-discriminatory conditions for comparable categories of re-use and at the marginal cost linked to the distribution of the data, at maximum" (Directive 2003/98/EC) (4), with a scattering of the key words one might expect: open access, use and re-use, licensing, collaborative, volume, professional development.

Though the mechanisms are European, there is an acknowledgement that this is a "worldwide endeavour" with a need for a response on a "global level" (12). I'd like to think we've progressed enough to ensure that these recommendations and ongoing cooperation can encompass third countries as well (*cough* Brexit *cough*).

Recommendations

The main recommendations [1.] to [9.] each state a clear requirement for national action plans and policies which provide for:

  • concrete objectives and indicators to measure progress; 

  • implementation plans, including the allocation of responsibilities and appropriate licensing; 

  • associated financial planning.

Open access to scientific publications

Is explicit about the stakeholders who should be able to access scientific publications: "innovative companies, in particular small and medium-sized enterprises, independent researchers (for instance citizen scientists), the public sector, the press and citizens at large" [1.]. There is a clear requirement for transparency about agreements between public institutions and publishers, and a target for all publicly funded research publications to be open access by 2020, becoming open no later than six months after publication (twelve months for social sciences and humanities). This vision of openness includes licence terms which "do not unduly restrict text and data mining of publications".

Delivery is to be supported by institutional policies, guidance on compliance, funding for dissemination and open access as a condition of funding [2.].

Management of research data, including open access

Calls for data management as standard from the point of data collection or generation, with information "as open as possible as closed as necessary", FAIR compliant (findable, accessible, interoperable and re-usable), and held within a "secure and trusted environment" [3.]. Access to and preservation of research data is to be assured through data management planning skills and digital infrastructures (including EOSC) with an explicit requirement that "datasets are easily identifiable through persistent identifiers and can be linked to other datasets and publications through appropriate mechanisms, and that additional information is provided to enable their proper evaluation and use". 

Stakeholders are as [1.] above and delivery reflects the same targets of policy, guidance and funding support. A national requirement for DM plans is mentioned, as is their inclusion as a basic principle in grant agreements and other financial support.

Preservation and re-use of scientific information

[5.] is fairly brief and to the point, recommending preservation policies, effective deposit systems, ensuring that scientific information selected for long term preservation “receives appropriate curation, along with hardware and software necessary to allow the re-use”, and that conditions permit “value added services” based on re-use.

Persistent unique identification for findability, reproducibility and preservation is covered within the wider context of linking “research outputs, researchers, their affiliations and funders, and contributors” and there is welcome guidance that licensing systems and conditions should become machine-readable. 

Infrastructures for open science

[6.] and [7.] reflect a drive towards researcher access to "resources and services for storing, managing, analysing, sharing, and re-using scientific information" through economically efficient infrastructures. The quality, reliability and interoperability of these infrastructures (including EOSC) are to be assured through data and service standards and metrics that support evaluation of research, careers, impact and openness. 

Skills and competences

[8.] Seeks continuous relevant training throughout education and work covering: open access, research data management, data stewardship, data preservation, data curation and open science, with specific reference to data specialists, technicians and data managers in data-intensive computational science.

Incentives and rewards

Evaluation of researcher recruitment, careers, research grant award processes and research institutions are all in scope here. Support and rewards are sought for early sharing and open access to publications and other research outputs with a clear focus on 'new generation metrics' that also provide indicators about the "broader social impact of research" [9.].

So…

From a repositories and archives perspective the recommendations provide concise criteria as we progress towards integrated infrastructures and add linked open data at scale to the file-driven technologies we’re familiar with. It might even provide some clues as to the mysterious 'technical and organisational measures' we must all provide evidence for under the GDPR. 

I, for one, look forward to skilled, incentivised and rewarded people delivering global, seamless scientific infrastructures, which assure access to re-use-ready, well-managed and preserved data that underpin open access publications. But what will we do in 2019?

I’m currently sitting on the DPC Advocacy and Communications Committee, so if you have any comments, questions or criticisms of the above feel free to drop me a line and I’ll collate and share.


Hervé L’Hours is the Repository & Preservation Manager at the UK Data Archive and UK Data Service at the University of Essex. He is current vice-Chair of the CoreTrustSeal and Chair of the CESSDA Trust Group, so has all sorts of reasons for worrying about this stuff. All opinions and poor attempts at humour are entirely his own.

