Further Resources and case studies

This section highlights some useful resources recommended by our expert workshop participants.

Case studies

The case studies listed in this section provide helpful examples of how computational access techniques have been applied in practice.

Discovering topics and trends in the UK Government Web Archive by the Data Study Group at the Alan Turing Institute – ‘A detailed case study demonstrating a series of experiments to unlock the UK Government Web Archive for research and experimentation by approaching it as a dataset.’ Jenny Mitcham, Digital Preservation Coalition
Providing computational access to the Polytechnic Magazine (1879 to 1960) by Jacob Bickford – ‘A practical and accessible case study describing a project at the University of Westminster which used NLP to provide computational access to a collection of digitized material. It includes links to useful resources that helped with getting started, and the code developed is also shared so that others can make use of it.’ Jenny Mitcham, Digital Preservation Coalition
Datathon slides highlighting possibilities with the material by the Archives Unleashed Project – ‘A collection of slides from datathons run as part of the Archives Unleashed project, these highlight a number of computational methods that could be applied.’ Leontien Talboom, University College London
Using AI for digital selection in government by The National Archives (UK) – ‘Outputs from an experiment which tested and evaluated a series of tools to undertake automated classification of a dataset to predict retention labels.’ Jenny Bunn, The National Archives (UK)
Reflecting On a Year of Selected Datasets by Predo Gonzalez-Fernandez – ‘A useful write up of how datasets have been made available at the Library of Congress.’ Jacob Bickford, The National Archives (UK)
Exploring National Library of Scotland datasets with Jupyter Notebooks by Sarah Ames and Lucy Havens – ‘An interesting example using Jupyter Notebooks to explore National Library of Scotland collections in a different way.’ Leontien Talboom, University College London
Web Data Research blogs by the Internet Archive – ‘Multiple examples of types of use and various tools, platforms, and partnerships.’ Jefferson Bailey, Internet Archive
Machine Learning with Archive Collections by Jane Stevenson – ‘This blog post shows the possibilities and potential of machine learning when applying it to archival collections.’ Leontien Talboom, University College London

The case studies below were shared as part of a series of events to promote the launch of this guide and represent a range of different approaches to computational access.

Jacob Bickford, The National Archives UK - ‘DIY’ Computational Access: the Polytechnic Magazine (1879 to 1960)

Sarah Ames, National Library of Scotland - Collections as data at the National Library of Scotland: access, engagement, outcomes

Ian Milligan, University of Waterloo - Providing Computational Access to Web Archives: The Archives Unleashed Project

Ryan Dubnicek and Glen Layne-Worthey, HathiTrust Research Center, University of Illinois Urbana-Champaign - How to Read Millions of Books: The HathiTrust Digital Library and Research Center

Jefferson Bailey, Internet Archive - Tales from the Trenches: Building Petabyte-scale Computational Research Services

Tim Sherratt, University of Canberra - Living archaeologies of online collections

Examples of approaches and infrastructures

The list below provides some helpful examples for you to explore, showing how different organizations have opened up their collections for computational access using a variety of techniques:

SHINE – ‘This is a prototype search engine for UK Web Archive with trend analysis. It is a good example of what is possible but is also very user friendly – a great way of introducing people to looking at a digital resource in a different way.’ Jacob Bickford, The National Archives (UK)
GLAM Workbench – ‘A set of Jupyter notebooks, demonstrating different computational techniques. This is great for its broad range of subject matter and exposing the underlying code.’ Jacob Bickford, The National Archives (UK)
SolrWayback – ‘Search tools for web archives, again a great demonstration of what is possible.’ Jacob Bickford, The National Archives (UK)
DigHumLab – ‘A digital ecosystem that highlights a number of collections open for computational access, also includes some inspirational examples of how researchers have used these collections.’ Leontien Talboom, University College London
Data Foundry – ‘This is a good example of access to National Library of Scotland data collections.’ Jane Winters, University of London
Web Archive Datasets – ‘Datasets made available by the Library of Congress Labs including Jupyter notebooks on working with the meme collection.’ Jacob Bickford, The National Archives (UK)
The Cybernetics Thought Collective – ‘A digitization project that also provides a web portal and analysis engine and highlights the potential of computational methods for this type of material.’ Leontien Talboom, University College London
BigLAM (Libraries, Archives and Museums) - ‘This collaboration showcases how there is an interest in GLAM material in the form of data. The provided datasets also highlight what attributes may be of importance to people wanting to use this material for different computational methods.’ Leontien Talboom, University College London

Case studies

Examples of approaches and infrastructures

Further reading