Kalpana Shankar

Last updated on 21 November 2018

Kalpana Shankar is Head of the School of Information & Communication Studies at University College Dublin

How do you keep a data archive open and relevant for fifty years or more?

For the last four years, my colleague Professor Kristin Eschenfelder (School of Library and Information Science, University of Wisconsin-Madison, USA) and I have been asking (and trying to answer) that question, along with many others, as part of a research project on the sustainability of data archives. Thanks to the Digital Preservation Coalition for giving us the opportunity to talk about this project and why it matters on World Digital Preservation Day.

We came to this project because we are both scholars who are interested in, among many other things, information infrastructures. That is, we’re interested in how information systems that support and facilitate research are conceptualised, developed, and most importantly, maintained. These platforms are designed to facilitate collaborative, large-scale, distributed research – often called “cyberinfrastructures” in North America or “e-science”/“e-social science” in Europe (the e-, of course, is for electronic). These systems have supported and helped develop new kinds of tools and techniques for sharing data, conducting analyses, and working collaboratively, and have enabled new kinds of research that would not have been possible without the scope and scale of these tools. For several decades, these infrastructures have provided researchers and information professionals fertile ground for studying collaboration, data practices, digital preservation, and contemporary scientific work.

We noticed what wasn’t being discussed: organisational and institutional longevity and sustainability. How do these projects keep going over the long term, beyond the few years of grant funding that get them started? What happens to the tools and data they’ve generated? What can we learn from data archives that have persisted over time?

For our project, we decided to focus deeply on case studies of data archives with long histories: social science data archives (SSDA). For over sixty years, social science data archives in the US and Europe have been curating and preserving (primarily quantitative) data derived from political science and economics research, census data, administrative data from government agencies, and other datasets. SSDA predate the Internet and digital data, but have been and continue to be influential in making high-quality data available to the social science research community, policy makers, and civil servants. Their long history provides an opportunity to examine the back-end of infrastructure over time – massive changes in technical and organisational infrastructure, changes in product pricing and packaging, and changes in professional information practices – in the context of the ups and downs of funding cycles and changing fashions in the social sciences. SSDA have also been extensively involved in data standards development, the professionalisation of data librarianship, open data, digital curation, and the training of social scientists and data librarians. New knowledge about the history of SSDA can contribute to current conversations on the long-term sustainability of all data and knowledge infrastructures.

We have spent the last few years conducting research in several SSDA in the US and Europe – ICPSR, UKDA, LIS, Roper Opinion Polls, and Berkeley – to learn how they have maintained themselves over time – or in some cases, how and why they were discontinued. In addition to collecting documents from these organisations (correspondence, contracts, memoranda, and policy documents, among others), we conducted interviews with current and former staff and administrators. We have also obtained similar documents from “meta organisations”: the International Federation of Data Organisations (IFDO), the Council for European Social Science Data Archives (CESSDA), and the major professional organisation for social science data librarians, the International Association of Social Science Information Services and Technology (IASSIST).

While we are still analysing this rich data, we have been reflecting in the literature on the complex negotiations and strategies that SSDA have employed to stay open over time. Relationships among archives, evolving business models, the articulation of value and mission, the role of field-level organisations (such as IFDO and CESSDA), and standards work have been some of the topics on which we’ve written. We note that this kind of research is timely as data organisations around the world are contending with maintaining funding and evolving their own business models (the OECD released a report in December 2017 that is a deep analysis of the specifics of business models in data archives). Our analysis will be useful for those planning future data archives as it will give them a better understanding of the types of changes they should expect to make over time in order to keep their projects afloat.

In short, what we have been focusing on throughout is documenting the labour that goes into “keeping the trains running on time over time”, with close attention to relationships, business models, funding streams, partnerships, stakeholders, mission, and values. With respect to World Digital Preservation Day, for which this post was written, we intend to raise awareness that understanding the institutional and organisational dimensions of data preservation is essential to successful data preservation.

For more information on our project and access to our publications, please refer to our website: https://kreschen.wordpress.com/social-science-data-archives-history-and-sustainability/
