Louise Preston

Last updated on 11 May 2023

Louise Preston is a Project Officer at the National Archives of Australia. She attended FDO2022 with support from the DPC Career Development Fund, which is funded by DPC Supporters.


Writing systems developed in Mesopotamia and other ancient societies to manage information because human memory simply could not store all information. It was a new and specialised field that began with only partial script. As writing and recording became more complex, the amount of information stored grew, becoming difficult to find and retrieve. Ancient scribes learnt to use catalogues, dictionaries, tables, forms and calendars to access and retrieve information (Harari, Yuval Noah, Sapiens, 2015).

Today, we are still grappling with how to manage and retrieve information, but with the additional tier of machines to store it. We have developed ever larger repositories to store vast amounts of digital information kept in different formats in different systems. The 1st International Conference on FAIR Digital Objects (FDO2022), held from 26 to 28 October 2022 in Leiden, Netherlands, and hosted by the FAIR Digital Objects (FDO) Forum, focused on how to apply the FAIR (Findable, Accessible, Interoperable, Reusable) guiding principles to managing that increased volume and complexity using computational systems. The principles, which apply to data, metadata and the infrastructures maintaining them, were developed in Leiden for the management and stewardship of scientific and scholarly data and published in 2016 (Wilkinson et al., 2016). An international set of principles, they do not prescribe any technology, standard or implementation method.

I was fortunate and privileged to receive a grant from the DPC to attend the conference and hear first-hand how digital objects such as data, metadata and software are being managed using the FAIR approach. The conference showcased multiple studies and projects working to implement the guidelines so that data in scientific research repositories can be found, accessed, made interoperable and reused. FDOs are an attempt to remove silos of data by linking different data sources, their metadata, the data rights applied and so on, to help us verify the source of the data and how it can be used.

The conference themes of technology, research and policy attracted many speakers and attendees from the scientific research and technology communities. There was an emphasis on technology in the presentations yet the human and social context could not be avoided. This was particularly evident in discussions about:

  • improved ways for machines to manage and access vast data repositories;
  • standards and protocols for FDOs;
  • different access levels to address privacy and security issues.

Machines need to use legitimate, trusted methods to automatically manage and retrieve data. FAIR Digital Objects were presented as a way to improve trust in information accessed on the Internet. Keynote speaker Bob Kahn, a co-developer of the TCP/IP protocols for the Internet, proposed the concept of a system for managing digital objects in 2007 (Kahn, 2007). Humans cannot access information with the speed and coverage that a computer can; however, as Kahn pointed out, the latter cannot understand the semantics used by humans. He asked how machines could process philosophical queries such as the one that opens Elizabeth Barrett Browning’s poem: “How do I love thee? Let me count the ways.”

FDOs seek to do for computer readers what the web does for humans with Google, claimed conference presenter George Strawn of the US National Academy of Sciences. One initiative to achieve this goal is the Semantic Web, which uses technologies to formally represent metadata and to analyse and integrate data in a language that computers can understand. The idea is to have autonomous agents that act on our behalf, but it remains to be seen how they will interpret sarcasm or poetry.

Bob Kahn noted that compatibility is a large and growing problem. There are data tapes in storage that cannot be read by machines currently being manufactured. There are issues with filing and storage: digitising and putting objects on the Internet can be a solution but, ultimately, the format compatibility and storage problems must be handled by someone. Archivists are all too familiar with these topics. The human and social factors evident in these points did not escape Henrik Tom Wörden of IndiScale GmbH, who asked who adds the required metadata. Although it needs to be the researcher or scientist, they have little incentive to do so, especially with few tools available to assist them in metadata management.

One example of FDO implementation presented was the Research Data Repository (RADAR), which many institutions now use, according to Dr Felix Bach of FIZ Karlsruhe, Germany. The repository provides an infrastructure for research data management and is intended to permanently store research data and metadata from projects. The RADAR API can be used to upload metadata, but metadata schema editors need to be linked to terminologies because there are so many vocabularies. Additionally, there are multiple standards. Carole Goble, Professor of Computer Science at the University of Manchester, noted that there are 1600 known standards for scientific data, which would be difficult to map. The multiplicity of ontologies and standards is a hurdle for managing scientific data, and it results from the different approaches adopted by different people and organisations.

Persistent identifiers, which make scientific research data findable on the Internet and are a key part of the FAIR principles, were also discussed. For instance, Digital Object Identifiers (DOIs) and Handles are recommended by the Australian Data Archive for citations of the scholarly research data it manages. One presenter, Donny Winston of FAIRPoints, promoted the use of the Archival Resource Key (ARK), which is used by various archives, museums, data repositories and publishers to provide reliable references to scholarly, scientific and cultural objects. Not all institutions use them, and any system like this needs uptake and funding; it depends heavily on humans and institutions adopting and servicing those processes.
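To make the idea of a persistent identifier more concrete, here is a minimal sketch in Python of how the three identifier types mentioned above can be turned into web addresses through their public resolver services (doi.org, hdl.handle.net and n2t.net). The function name `resolver_url` and the `doi:` / `hdl:` / `ark:` prefix convention are assumptions made for illustration and are not something presented at the conference. The DOI shown is the one for Wilkinson et al. in the references; the Handle and ARK values are made up.

```python
from urllib.parse import quote

# Public resolver services for each identifier scheme.
RESOLVERS = {
    "doi": "https://doi.org/",
    "hdl": "https://hdl.handle.net/",
    "ark": "https://n2t.net/",
}


def resolver_url(identifier: str) -> str:
    """Build a resolver URL from a prefixed identifier (hypothetical helper)."""
    scheme, sep, rest = identifier.partition(":")
    scheme = scheme.lower()
    if not sep or not rest or scheme not in RESOLVERS:
        raise ValueError(f"unrecognised identifier: {identifier!r}")
    path = quote(rest, safe="/")
    if scheme == "ark":
        # The N2T resolver expects the "ark:" label to stay in the path.
        return RESOLVERS["ark"] + "ark:" + path
    return RESOLVERS[scheme] + path


if __name__ == "__main__":
    print(resolver_url("doi:10.1038/sdata.2016.18"))   # DOI from the references
    print(resolver_url("hdl:20.500.12345/678"))        # made-up Handle
    print(resolver_url("ark:/12345/x6np1wh8k"))        # made-up ARK
```

The sketch only builds the address. Whether that address keeps working is the human part: someone has to keep the identifier registered and pointing at the right object, which is the dependence on institutions noted above.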

Access levels and security of information for scientific datasets were central to the talk by Anne Fouilloux of Simula Research Laboratory, Norway. She cautioned against providing full open access. She illustrated this with a case in which the geographical locations of a new species had to be withheld because poachers were using scientific papers to find those species (The Guardian, 1 January 2016). Fouilloux made the valid point that you can’t share everything; you need a protocol for how to access data. Her organisation consequently applied not only the FAIR but also the CARE (Collective Benefit, Authority to Control, Responsibility, Ethics) principles in its open earth science research projects. She added that the CARE principles are not machine-readable.
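A simple access protocol of the kind Fouilloux described might look something like the following Python sketch. It is illustrative only: the `Observation` record, the `view_for` function and the two access levels ("public" and "approved") are hypothetical and do not describe any system shown at the conference. The species description stays openly available, while the coordinates of a record flagged as sensitive are withheld from public requests.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Hypothetical access levels for this sketch.
ACCESS_LEVELS = ("public", "approved")


@dataclass(frozen=True)
class Observation:
    species: str
    latitude: Optional[float]
    longitude: Optional[float]
    sensitive: bool  # True when the location must not be shared openly


def view_for(obs: Observation, access_level: str) -> Observation:
    """Return the version of a record that a given access level may see."""
    if access_level not in ACCESS_LEVELS:
        raise ValueError(f"unknown access level: {access_level!r}")
    if access_level == "approved" or not obs.sensitive:
        return obs
    # Public request for a sensitive record: keep the description,
    # withhold the location.
    return replace(obs, latitude=None, longitude=None)


if __name__ == "__main__":
    record = Observation("newly described gecko", -12.5, 130.9, sensitive=True)
    print(view_for(record, "public"))
    print(view_for(record, "approved"))
```

Deciding which records are sensitive, and who counts as approved, is a human judgement that the code cannot make, which echoes her remark that the CARE principles are not machine-readable.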

The conference concluded with the Leiden Declaration on FAIR Digital Objects, which states its aims as follows:

  • Support the FAIR guiding principles to be applied (ultimately) to each digital object in a web of FAIR data and services;
  • Support open standards and protocols;
  • Support data and services to be as open as possible, and only as restricted as necessary;
  • Support distributed solutions where useful to achieve robustness and scalability, but recognise the need for centralised approaches where necessary;
  • Support the restriction of standards and protocols to the absolute minimum;
  • Support freedom to operate wherever possible;
  • Help to avoid monopolies and provider lock-in wherever possible.

The word ‘ultimately’ in the first point suggests FAIR solutions are still some time away. It is also clear from talks and discussions at the conference that, despite the focus on technology, humans still have a lot to do to get it right so that machines can do their bit as autonomous agents acting on our behalf. It is people who need to adopt consistent approaches, use persistent identifiers and apply metadata. Communities need to collaborate. Access levels and privacy require compliance. These are societal and cultural factors requiring human input. Without this input, machines are perhaps like the early ancient scribes working from a partial script.

The management of data, datasets and metadata is integral to information management, archives management and the digital preservation field. Discussions at the conference returned repeatedly to topics familiar and salient to these professions. The National Archives of Australia's Check-up survey of Australian Government agencies' information management, for instance, includes questions about the ability of those agencies to find, access and reuse information assets and to assess their interoperability. Section G of the 2022 survey, in particular, 'Use, reuse and interoperability', pinpoints data and metadata questions that align with the FAIR principles. That section also addresses many of the same topics of concern raised at the conference, including agreed standards, controlled vocabulary, standardised metadata and open access when possible. Conference discussions about these attributes highlighted the need for trustworthy information, showing the same regard as the information management professions for reliability, integrity, usability and authenticity.

The conference showcased initiatives by bringing together the different communities of science, technology and policy in the pursuit of FDOs. As the FAIR guidelines were developed for scientific data management in research data repositories, this was a logical first step by the FAIR Digital Objects Forum. It was also acknowledged that this was only the first international conference on FDOs. It was helpful to learn about FDO initiatives, and it is arguable that collaboration with the wider information management sector can only benefit efforts towards FDOs, both at future conferences and, in general, for all fields involved.

 

References

FAIR Digital Objects Forum website, https://fairdo.org/about/

GO FAIR website, FAIR Principles, go-fair.org

Harari, Yuval Noah. Sapiens: A Brief History of Humankind, Chapter 7, 'Memory Overload'. Vintage, London, 2015 [first published 2011].

Kahn, Robert. Managing Digital Objects on the Internet. Lecture on 26 October 2007, https://pswscience.org/meeting/managing-digital-objects-on-the-internet/

Leiden Declaration on FAIR Digital Objects https://www.fdo2022.org/programme/leiden-declaration-fdo

Neslen, Arthur. Poachers using science papers to target newly discovered species. The Guardian, 1 January 2016, https://www.theguardian.com/environment/2016/jan/01/poachers-using-science-papers-to-target-newly-discovered-species

Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 3, 160018, March 2016, http://doi.org/10.1038/sdata.2016.18

