Miguel Ferreira

Last updated on 21 November 2018

Miguel Ferreira is Executive Director at KEEP SOLUTIONS, based in Braga, Portugal.

“If they can’t agree on the plug, how can they ever agree on the metadata?”… This sentence has stuck with me for over 13 years now. It came about in the summer of 2005, when a DSpace user group meeting was about to take place in the beautiful city of Cambridge, in the UK. The meeting intended to give a voice to real-world users and to provide an opportunity for young developers such as myself to learn from the collective wisdom of all of those present.

I was a young researcher at the time working in digital repositories and open access to research publications, which was, and still is, held as an unquestionable religion at the University of Minho where I worked.

It was an exciting time for me, as it was the very first time I was traveling abroad to present my work to an international audience. The environment at the venue was amazing! DSpace was an exciting new product and academic institutions around the world were adopting it for safekeeping their institutional memory and scientific production.

Younger attendees were lodged in a students’ residence conveniently unoccupied during the holiday season. After arriving at my modest room and unpacking my luggage, I needed to recharge my laptop. That’s when I realised that the power outlet was somewhat unfamiliar to me. I rushed downstairs to where I had checked in a few minutes before, only to find that five other people were engaged in the same quest.

In front of me was a young fellow from the University of New York. His job there was to learn more about DSpace and bring back valuable information about that community to the people at his institution. After a quick chat about the oddity of the whole situation, he came up with that unforgettable sentence: “If they can’t agree on the plug, how can they ever agree on the metadata?”

This, of course, has resonated with me to this day. It was the perfect metaphor for a growing concern regarding interoperability between institutional repositories from different families, an issue expected to be discussed thoroughly over the following days.

Years passed, and I moved on to other jobs, always in the context of information management and digital curation. In 2008 I founded a company together with a couple of fellow researchers. We have been raising awareness of the risks of technological obsolescence while at the same time developing products that help mitigate those risks.

As a company, we have participated in several EU-funded projects in the field of digital preservation: SCAPE, which focused on the technological challenges of scaling up digital preservation processes to hundreds of millions of objects; 4C, which dealt with the financial aspects of digital curation and developed techniques for estimating the long-term costs of preservation; veraPDF, which aimed at creating an independent validation tool for files encoded in PDF/A; and finally E-ARK, which aimed at creating a small set of specifications for structuring digital information within packages that would simplify the communication between producers and digital archives and between archives and end-users.

The E-ARK project delivered specifications for submission, archival and dissemination information packages. The promise was that software applications from different vendors could exchange digital content and all of its interconnected informational components (e.g. descriptive, technical and preservation metadata).

These specifications have the potential to make a huge impact on the way digital preservation is implemented in the real world today. By adopting them, electronic records management systems from multiple vendors are able to send information to a variety of digital archiving systems without the need for custom developments or tedious negotiation processes between software vendors to align data models and APIs.

Archives gain the freedom to choose repository solutions, as they are assured that their data will not be trapped in any particular software architecture. E-ARK-compatible repository software can be updated or replaced entirely without the need to make changes to the underlying data or data models. E-ARK-compliant software will be able to absorb existing Archival Information Packages with mere changes to its configuration.

On the access side, distinct viewing applications can be used to deliver content to end-users, as information is packaged in a way that is recognisable to compliant applications.

There is still a long way to go to make this vision a reality. Only a handful of software applications currently support these specifications. Amongst the ones that do support them we can already find electronic records management systems, long-term digital repositories and content viewing applications.

After the end of the E-ARK project, the European Commission recognised the importance of this work and re-established E-ARK as an ongoing Building Block of the Connecting Europe Facility (CEF) programme, a key EU funding instrument to promote growth and competitiveness at the European level.

The overall objective of the CEF programme is to enhance Trans-European Networks and infrastructure (both physical and digital infrastructures). The CEF aims to improve the daily life of citizens, businesses and public administrations through the deployment of interoperable infrastructures based on mature technical and organisational solutions.

The goal of the recently created eArchiving Building Block is to enhance and promote the use of the specifications developed during the E-ARK project and to turn them into standards for exchanging digital information in the context of an Open Archival Information System (OAIS).

The aim of the eArchiving programme is to provide specifications, software, training and knowledge to help data creators, software developers and digital archives tackle the challenge of short, medium and long-term data management and reuse in a sustainable, authentic, cost-efficient, manageable and interoperable way.

The core of eArchiving is formed by Information Package specifications which describe a common format for storing bulk data and metadata in a platform-independent, authentic and long-term understandable way. The specifications are ideal for migrating long-term valuable data between generations of information systems, transferring data to dedicated long-term repositories (i.e. digital archives), or preserving and reusing data over extended periods of time and generations of software systems. 

In addition to the specifications, the eArchiving programme offers a set of sample software to demonstrate the use of the specifications in different business environments, and consultancy on long-term digital preservation risks and their mitigation.

The service is organised in four main activities:

  1. Specifications management - work focused on updating the specifications to meet the requirements of a broader audience and on starting the revision cycle of the specifications;
  2. Software portfolio, compliance services and helpdesk - an activity focused on delivering software and libraries to ease the process of adopting the specifications by institutions and software developers, as well as providing technical support to those who want to adopt them;
  3. Training - to promote the uptake of eArchiving and overcome barriers to entry, improve adoption rates, and assist in meeting the development needs of individuals and organisations engaged in eArchiving;
  4. User engagement and outreach - to build long-term relationships and ensure a coordinated approach when engaging with stakeholder communities. This means raising awareness, informing target communities about project releases, promoting the specifications, cooperating with user communities and influencing key actors who shape the eArchiving ecosystem while at the same time assuring the effectiveness of the entire communication strategy.

All of this means that, as a society, we have moved beyond the research stage and are now entering a new, more mature phase where professional services and ongoing support are available to help communities come one step closer to having a universal plug for exchanging information in a digital archiving environment.
