Practical Tools for Digital Preservation: A Hack-a-thon


As the volume and complexity of data grow, digital preservation relies increasingly on automated tools and processes. But preservation actions are often specific to the workflows and functions of individual institutions. So, although our toolkit has grown significantly, the very particular demands of individual workflows create a subtle and stubborn gap between tools and their deployment. This frustrates archivists and curatorial staff who want to simplify and automate their processes, and it inhibits developers who seek to provide tools that are satisfying and dependable.

This ‘hackathon’ will bridge that gap by facilitating a closer relationship between individual developers and digital collection owners. It will provide a forum for practical problem solving. It will help collection owners to articulate their requirements in ways that developers can address; and will help developers respond more precisely to the needs of a community hungry for solutions.

This event follows the success of the ‘AQuA’ project mashup events (http://wiki.opf-labs.org/display/AQuA/Home) which developed solutions to automate quality assurance of digital collections. In a similar way it will bring together digital collection owners and technical experts to address validation issues and long term access to digital content. It provides an opportunity to contribute your experience, and also to learn from the diverse skills and backgrounds of the other participants.

Who should come?

  • Collection owners who can bring along samples of their problematic digital collections. You will be asked to give a short talk to provide an overview of the content and the known or potential issues to the group.
  • Developers / technical experts who want to gain hands-on experience of applying digital preservation techniques to digital collections. You will be asked to give a short talk about your technical experience and interests.

We'll match up the collection owners with the technical experts to further discuss the issues before developing and applying preservation tools to meet your needs. We will run parallel workshops for collection owners to learn more about articulating requirements and creating concrete use cases to feed into tool development.

Provisional Programme

DAY 1: Tuesday 27 September



Practitioners sessions

Developers sessions

10.00 – 10.30      Coffee and registration

10.30 – 11.00      Welcome, introductions, housekeeping, overview of aims, sessions & structure (wiki and github)

11.00 – 11.30      Introduction to collection issues and articulating requirements

11.30 – 12.30      Collection owners presentations: outline content and issues (All CO); Developers presentations: interests and technologies (All Dev)

12.30 – 13.15

13.15 – 14.00      Pair/Group CO & Devs: discuss collection issues in further detail, record on flipcharts and feedback to the group

14.00 – 14.30      Creating requirements and use cases; Brainstorming (tools)

14.30 – 15.30      Break out groups: requirements and use case workshops (BvdW, KA, WK)

15.30 – 15.45      Coffee break

15.45 – 16.00      Write-up session: populate wiki with collections outlines, issues and requirements (BvdW, KA, WK)

16.00 – 16.15      Check in: CO and Devs to refocus hacking

16.15 – 16.45      Devs and COs feedback to the group




DAY 2: Wednesday 28 September

09.00 – 09.15

09.15 – 09.45      Check in: CO and Dev. Document on wiki

09.45 – 10.45      Planets one year on: OPF and SCAPE

10.45 – 11.00

11.00 – 12.15      Plato demo and requirements sessions

12.15 – 13.15

13.15 – 14.30      Fido demo and requirements sessions

14.30 – 15.30      (Taverna/Curators Workbench?) demo and requirements session

15.30 – 15.45

15.45 – 16.15      Check in: CO and Dev, continue write-up on wiki

16.15 – 16.45      Devs feedback to the group

16.45 – 17.30      Dev hacking



Hackathon: check availability of student accommodation/common room/wiki etc.

DAY 3: Thursday 29 September

09.00 – 09.15

09.15 – 10.30      Check in and write-ups: CO and Devs

10.30 – 10.45

10.45 – 12.15      Widget wish-list

12.15 – 13.15

13.15 – 14.15      Training / demo session??

14.15 – 15.00      Check in: CO and Dev

15.00 – 15.15

15.15 – 16.00      Evaluation and awards


DCC Research Data Management Roadshow: Brighton 4-6th October

Added on 26 September 2011

Places are still available at the 5th Digital Curation Centre Roadshow.  It is being organised in conjunction with the University of Sussex Library. It will take place over 4 - 6 October 2011 at the University of Sussex Conference Centre, Brighton. http://www.dcc.ac.uk/events/data-management-roadshows/dcc-roadshow-brighton


Deadline extended for grants to attend UKDA training course

Added on 21 September 2011

The deadline for applications for grants to attend the UK Data Archive's popular course 'How to run a Data Centre: the Challenges of Social Science Data' has been extended to 7th October 2011.


University of Leeds joins the Digital Preservation Coalition

Added on 19 September 2011

The University of Leeds has joined the Digital Preservation Coalition.

'Over the last few years our digital collections have grown and diversified', explained Bo Middleton of the University Library. 'They represent a considerable investment and we must move to protect these assets through active preservation.'


The Digital Preservation Coalition welcomes Leeds University as its latest associate member

Added on 16 September 2011

The University of Leeds has joined the Digital Preservation Coalition.

'Over the last few years our digital collections have grown and diversified', explained Bo Middleton of the University Library. 'They represent a considerable investment and we must move to protect these assets through active preservation.'

'We have joined the DPC because we want to be active participants in discussions of key digital preservation issues. We also recognise the benefits that will accrue from access to world class research. We are delighted to become members of the key forum for the digital preservation community which is working to develop policies and encourage the adoption of best practice.'


DPC Response to EU Science Information Policy Consultation

The DPC has responded to a consultation from the EC regarding science information policy, noting that the impacts sought from improved access to scientific information are only viable where sufficient attention is paid to preservation. 

Preservation has a particular importance for scientific information because meaningful innovation is necessarily responsive to previous generations of research. In that sense, preservation of appropriate research outputs is essential to all sciences, especially for unrepeatable experiments or unique moments of discovery. Aspirations about access to information are meaningless without commensurate actions to ensure preservation. We welcome all actions that will encourage a dialogue between and within member states to ensure the preservation of scientific information and we call on the EU to engage in that dialogue as a matter of urgency, using existing examples of best practice to help build capacity.

Read the original consultation from the EC

See the full text of the DPC response.


DPC Responds to EU Science Policy Consultation

Added on 8 September 2011

The DPC has today responded to a new policy consultation from the EU regarding preservation and access to scientific information.


DPC Prospectus 2011-12 released

Added on 5 September 2011

The DPC has today released its prospectus of activities for 2011 and 2012. The prospectus explains the work of the Coalition and the benefits of membership for the 12 months ahead. It outlines the specialist working parties, expert events and publications that members will get priority access to over the period, including:


The Future of the Past of the Web: London 7th October 2011

Added on 5 August 2011

The DPC, JISC and the British Library invite you to a workshop and conference at the British Library Conference Centre, Euston Road, London on Friday 7th October 2011.

The Web expands at an astonishing rate. Statistics suggest that more than 70 new domains are registered and more than 500,000 documents are added to the web every minute. This rapid expansion continues to challenge those charged with preserving an effective memory of the web.


Preserving Email: Directions and Perspectives

Email is arguably the most ubiquitous, inexorable and voluminous manifestation of information technology. It is a defining characteristic of our age and a critical element in all manner of communications and transactions. Industry and commerce depend upon email; families and friendships are sustained by email; government and economies rely upon email; communities are created and strengthened by email.  It is sometimes hard to remember how we functioned before the widespread adoption of email in public and private life. But for all the importance of email and the transactions it supports, it is surprisingly absent from much of the digital preservation literature.  Institutions, organizations and individuals have a considerable investment and in many cases statutory requirements to safeguard large collections of email, so there ought to be a strong body of experience and clear workflows to follow.  So why is there so little detailed advice available?

To some extent email encapsulates many of the core challenges of digital preservation. It would be simple to preserve if it were not for the infinite variety of attachments that go with it; it would be simple to preserve if we could eliminate all the duplicates and spam; if we could remove all the personal details; if we could resolve the copyright issues; if we could resolve access and security barriers. These and other subtle, complex demands mean that the relatively simple proposition of preserving our collected digital correspondence can be blighted by interminable wrangling over procedure, policy and technology. Nonetheless the preservation of email creates a readily understood basis on which to engage the widest possible audience with digital preservation. It provides a pervasive environment for innovation and assessment of digital preservation tools and services. It will be a necessary component to ensure our digital memory is accessible tomorrow.

This DPC briefing day will provide a forum for members to review and debate the latest developments in the preservation of email. Based on commentary and case studies from leaders in the field, participants will be presented with emerging policies, tools and technologies and will be encouraged to propose and debate new directions for research.  The day will include a discussion of key topics such as:

  • Lifecycle management of email
  • Ingest, documentation and accession of email archives
  • Emerging tools and policies for preservation of email

Who should come?

This day will be of interest to:

    • Collections managers, librarians and archivists in all institutions
    • Tools developers and policy makers in digital preservation
    • Innovators and researchers in information policy and management
    • Innovators and researchers in computing science
    • Vendors and providers of email services

Draft Programme Outline

1030      Registration and Coffee

1100      Welcome and introductions (William Kilbride, DPC)

1105      The Nature of the Problem (Chris Prom, University of Illinois)

1135      Why preserving email is harder than it sounds - theory and practice (Stephen Howard, Information Management Officer, the United Nations)

1205      Receiving and managing email archives at the Bodleian Libraries - a case study (Susan Thomas, Bodleian Libraries)

1235      Discussion and questions

1245      Lunch

1330      Email management: 15 wasted years and counting (Steve Bailey, JISC InfoNet)

1355      Past, present and future in email preservation: practical experience and future directions (Maureen Pennock, British Library)

1420      Emerging tools for email preservation (Tom Jackson, Loughborough University)

1445      Discussion and questions

1500      Coffee

1515      Discussion and panel (led by Tim Gollins, TNA)

By 1600 Close

