Documentation interview with James Doig, Assistant Director, Digital Archiving Innovation and Research, National Archives of Australia - July 2023

Why is digital preservation documentation important to you and your organization?

Digital Preservation documentation takes a number of different forms at the National Archives of Australia (NAA), including procedures and guidelines. 

DP documentation is critical to us for a number of reasons:

  • Clear and consistent approach to digital preservation/digital archiving activities and processes. Clarity and consistency are important given the complexity of digital transfers from agencies.

  • Risk mitigation - given the risks involved in managing digital records (inherent fragility etc) a clear risk mitigation is to have complete and accurate documentation to ensure consistent treatment of digital records

  • Compliance and auditing - artifacts like SOPs are also important for auditing purposes.  We often have internal and external audits of functions and processes.  There is a need for transparency and sound records management.

  • Business continuity and succession planning - very important for new staff coming on board to have clear instructions/documentation

  • Training and onboarding/induction - procedures etc are important as a training resource

  • Knowledge retention - Content Manager/TRIM is our corporate recordkeeping system and all key documents/records are stored there.  We need to ensure that current procedures, processes, workflows are captured there

  • Communication and collaboration - allows other areas of NAA and other institutions to have a shared understanding of how we do things  

 

What tools or platforms do you use to create and provide access to your documentation? What works or doesn’t work well with these tools or platforms?

M365 and MS Office for document creation - we have Word templates for reports and other document types that include important information such as document owner, document approvals and document history. All records need to be captured in Content Manager (with files and documents appropriately titled). Staff recordkeeping is not always great (eg titling, versioning, finalising), but you can usually find what you’re looking for.

 

When do you create documentation and how often is it reviewed and updated?

NAA is currently in transition, with the acquisition of Preservica, system development and an organizational restructure, so this aspect of digital archiving has been neglected and needs to be addressed.

In the past, documenting procedures and workflows has not been systematic and coverage has been quite patchy.   

 

What is the update process and how do you manage versions?

The document template should include a business owner and review date.  Versions are managed via the document approvals/history section of the template and versions are also managed in Content Manager (eg via document title).

 

What is next for digital preservation documentation at your organization?

There is a working Excel register titled Digital Preservation Task Logbook that sets out all completed and outstanding work, including documentation that needs to be created.  The spreadsheet includes columns for Task, Type (eg procedure), Priority, Risk, Complexity, Due Date, Responsible Officer, Status (complete, in progress, not started), Sign off (who the approving officer is), Comments, and Resource/Links (eg the Content Manager number of the document, link to a DPC page etc).
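As a rough illustration only, a register with these columns could be modelled along the following lines. This is a hypothetical sketch, not NAA's actual logbook: the field names follow the columns listed above, while the `LogbookEntry` class, the `outstanding_documentation` helper and the sample rows are assumptions invented for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class LogbookEntry:
    """One row of a hypothetical task logbook; fields mirror the columns listed above."""
    task: str
    task_type: str            # the "Type" column, e.g. "procedure"
    priority: str
    risk: str
    complexity: str
    due_date: date
    responsible_officer: str
    status: str               # "complete", "in progress" or "not started"
    sign_off: str             # who the approving officer is
    comments: str = ""
    resource_links: str = ""  # e.g. a Content Manager number or a link to a DPC page


def outstanding_documentation(entries: List[LogbookEntry]) -> List[LogbookEntry]:
    """Return entries that are not yet complete, earliest due date first."""
    pending = [e for e in entries if e.status != "complete"]
    return sorted(pending, key=lambda e: e.due_date)


# Invented sample rows, for illustration only.
entries = [
    LogbookEntry("Document ingest procedure", "procedure", "high", "high", "medium",
                 date(2024, 3, 1), "Officer A", "not started", "Approver X"),
    LogbookEntry("Transfer checklist", "guideline", "medium", "low", "low",
                 date(2023, 12, 1), "Officer B", "in progress", "Approver X"),
    LogbookEntry("Checksum workflow", "procedure", "high", "medium", "low",
                 date(2023, 9, 1), "Officer C", "complete", "Approver Y"),
]

for entry in outstanding_documentation(entries):
    print(entry.due_date.isoformat(), entry.status, entry.task)
```

Keeping status and due date as explicit fields is what makes it easy to pull out a list of documentation that still needs to be written, whether the register lives in a spreadsheet or elsewhere.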

 

Are there any resources or examples that have been really useful to you in creating your own documentation?

Generally discussions with staff about gaps/requirements, DPC RAM and other audit tools

 

What tips do you have for people starting out on documenting their digital preservation activities?

  • Stakeholder engagement is always critical (could include mapping the business process through a methodology like Lean Six Sigma)

  • Use existing SOP and/or document templates that allow you to manage versions and approvals

  • Always include a business owner and review date, and think of the audience.


Interviews and Case Studies

Some of our focus group attendees have shared further information about their own documentation practices, and a number of other helpful case studies are also included below:

Presentations given at launch event

Interviews with focus group attendees

Other community case studies

  • In the Information Maintainers November 2020 meeting, Kaitlin Newson from Scholars Portal discusses the actions, routines, and communities that are needed to support and maintain the documentation needs of information systems.

  • The DPC’s annual Workflow Webinar series can be a rich resource for learning about digital preservation workflows. These webinars sometimes touch on documentation and often include workflow diagrams as a part of the presentation. One case study (using GitHub as a public documentation platform) can be found in a 2023 presentation from Julia Miller of the PARADISEC archive, in which she shares some of the organization’s online documentation and answers related questions about documentation in the Q&A session at the end.


Further Resources

Further Reading

OSSArcFlow Guide to Documenting Born-Digital Archival Workflows - This is a great resource describing the outcomes and lessons learned from a project which focused on creating a methodology for documenting digital preservation workflows with a range of organizations using open source tools. The section ‘Documenting Born-Digital Workflows’ from page 30 onwards is particularly helpful in providing tips for documentation, as are the appendices, which include templates for documentation that were used throughout the project. The project also produced a short video on documenting workflows.

Write the Docs - This is a global community of people who care about documentation. It focuses primarily on documenting code, but there is lots of useful information that is more broadly applicable to other types of documentation. They have a helpful Documentation Guide as well as an annual conference about documentation and a reading list.

Journal of Documentation - A whole journal devoted to the topic of documentation! Though the articles are often focused on documentation topics outside of the scope of this guide, there are likely to be some papers that are of interest. Occasional issues of this journal are open access.

Don’t Let All That Work Go to Waste: Documentation Strategies for Success - an online tutorial presented by Nathan Tallman and Carly Dearborn in April 2020 for the Council of State Archivists, State Electronic Records Initiative. There are some great tips here and helpful information about some of the tools that can be used to create documentation (including GitHub).

Tool demos, Digital Preservation Handbook - the Digital Preservation Handbook includes a number of demos of how to use commonly used digital preservation tools (for example DROID, Checksum by Corz and TeraCopy). Do link to these demos from your own documentation if it is helpful to do so rather than trying to re-invent the wheel.

Examples

It can be incredibly helpful when organizations publish their digital preservation procedures. Often this is done with certification in mind - CoreTrustSeal for example encourages openness and transparency around procedures. Here are some good examples to look at:

TIB - This is a really good example of publicly available documentation. TIB make this available as a comprehensive series of interlinked wiki pages available in German and in English. See for example the page on the ingest process which includes textual descriptions of the process as well as detailed workflow diagrams.

Minnesota Historical Society - This ‘Digital Archivists Manual’ is published as a Google Doc to allow for comment and easy update. This is a great example of publicly available documentation, covering a whole range of digital preservation processes and procedures including instructions on how to run specific tools needed for the workflows.

Archaeology Data Service - The ADS has published much of their procedural documentation (as well as policies) on their website. This is a helpful resource providing detailed information about how this data archive handles a range of different scenarios. See for example data preservation procedures which cover both ingest and specific workflows for different data types.

Bentley Historical Library - The Bentley Historical Library provides published procedures and conventions for the arrangement and description of paper and digital archives as well as the curation of web archives. See for example the procedures specific to web archiving and those specific to digital processing.  

Rockefeller Archive Center - This is a nice example of a suite of documentation that has been built using GitHub and made available online. There is a range of policy and procedure documents here - see for example the Digital Media Transfer Workflow.

Archivists Guide to Kryoflux - This is a community created documentation effort for a specific piece of hardware. It is hosted on GitHub and includes a helpful description of the revision process and some tips on using GitHub.

Community Owned Workflows - a forum for sharing digital preservation workflows with the community, this is a helpful resource to browse to see how others have described and illustrated their workflows.

OSSArcFlow As Is Workflows - during this project, partners modeled and documented a range of born digital workflows using a standard documentation methodology. These workflows are available in this report.


How was this guide created?

The creation of this guide was a collaborative effort, beginning with a series of focus groups with DPC Members, designed to discuss and brainstorm the topic of digital preservation documentation. Much of the content of this guide articulates the ideas shared by Members in these sessions. Focus group participants also helped develop the text of the guide by commenting on drafts and providing case studies.

The DPC is grateful to the following Members for their participation in the focus groups and input into the development of this guide:

  • Matthew Burgess, State Library of New South Wales

  • James Doig, National Archives of Australia

  • Taryn Ellis, State Library of South Australia

  • Carey Garvie, State Library of Victoria

  • Bryony Hooper, University of Sheffield

  • Herve L’Hours, UK Data Archive

  • Kiki Lennaerts, Netherlands Sound and Vision

  • Rachel MacGregor, University of Warwick

  • Roxana Maurer, Bibliothèque nationale du Luxembourg

  • Kieron Niven, Archaeology Data Service

  • Laura Peaurt, University of Nottingham

  • Jaana Pinnick, British Geological Survey

  • Ziggy Potts, Art Gallery of New South Wales

  • Irina Schmid, American University in Cairo

  • Kristen Schuster, King's College London

  • Paul Stokes, Jisc

  • Lucy Wales, British Film Institute

DPC staff members Sharon McMeekin, Jenny Mitcham, Ellie O’Leary, Michael Popham, Paul Wheatley and Robin Wright were also involved in focus group discussions and/or the production and publication of this guide.

Participants at the first focus group session on 24th February 2023

Participants at the second focus group session on 6th March 2023


Introduction to Digital Preservation Documentation Guide

Documentation is important for any organization or project, and of course is also central to digital preservation good practice. It is necessary for moving forward with digital preservation maturity models such as DPC’s Rapid Assessment Model and the NDSA Levels of Digital Preservation. It is essential for providing evidence of processes when applying for digital preservation certification such as CoreTrustSeal or ISO 16363. If you need further persuasion, a DPC blog post from Amy Rudersdorf of AVP provides a compelling list of reasons why we should all make time for documentation.

Good documentation is undoubtedly good practice, but it is not always given the attention it deserves. The OSSArcFlow project noted in 2020 in their Guide to Documenting Born-Digital Archival Workflows that: “the vast majority of today’s born-digital archiving activity is not well documented”.

There are many possible reasons for this. The OSSArcFlow project notes for example that: “most collecting institutions believe that their born-digital archiving workflows are still too ad hoc or nascent to deserve formal documentation.”

There is sometimes a reluctance to formalize processes and commit things to paper when they are still evolving, but there are many benefits to doing so (as discussed in ‘Why should we document?’).

It is a given that digital preservation practitioners need skills in creating documentation. The DPC’s Competency Audit Framework includes the skills around the production of documentation under the ‘Advocacy and Communications’ competency area and includes example tasks such as documenting procedures and workflows and producing technical documentation.

In 2022, the DPC heard from its Members that it would be helpful to be able to share experiences and tips on how to create documentation. In response to this, early in 2023 we ran a series of focus group meetings with the aim of facilitating knowledge exchange and an intention to gather the information shared into this online guide. 

Audience

This guide is aimed at digital preservation practitioners who must create, update and maintain digital preservation documentation as part of their work. It provides guidance on how to create and manage good documentation. An understanding of digital preservation concepts and processes is assumed.

Scope

‘Documentation’ in the context of this guide refers to documentation that is important for the day-to-day operation of digital preservation within an organization, for example recording how digital preservation tasks and procedures are carried out and how systems are integrated and configured. Documentation exists in the form of text and illustrations that can be shared with others - not as ideas that exist only in someone’s head! If a key member of your digital archives team unexpectedly left, would others be able to pick up their digital preservation work based on the available documentation? Would the operations of the digital archive be able to continue seamlessly and consistently? We focus in this guide on the types of documentation that would help to facilitate these goals.

There are however other types of documentation that are relevant to digital preservation, and these are considered to be out of scope for this guide:

  • This guide does not cover digital preservation policy and strategy documents or documentation relating to high level planning, reporting or resourcing. Note that guidance on writing a Digital Preservation Policy can be found in the DPC’s Digital Preservation Policy Toolkit

  • The guide does not cover documentation that helps to make individual records or datasets understandable. This may be the documentation that comes with digital content to enable others to interpret it, or documentation that is subsequently created by staff within the archive to enhance the usability of a particular dataset.

  • Also out of scope is Preservation Metadata (metadata about events, agents, rights and objects as described in the PREMIS data dictionary).


Digital Preservation Documentation: a guide

This guide provides advice on how to create and manage high quality documentation. It is aimed at digital preservation practitioners who create, update and maintain digital preservation documentation as part of their work.

In the context of this guide, ‘documentation’ refers to the documents that describe how digital preservation activities are carried out within an organization, for example recording a workflow or procedure, describing how to install and use a tool for a particular task or how digital preservation systems should be configured.

This guide discusses the importance of documentation, who it is for, and highlights some of the features of good and bad documentation. It goes on to provide some tips on creating documentation, including some of the tools or platforms available. Review and update of documentation is discussed, as are requirements for long term preservation.

This guide builds on the work of numerous members of the digital preservation community and brings together many good examples of digital preservation documentation. Our thanks go to all those DPC Members who engaged with discussions around documentation either through sharing their ideas in our focus groups, writing up or presenting case studies, or reviewing the text of the guide. Find out more about how this guide was created here.

Introduction

An introduction to this guide, including audience and scope

What, Why and Who of Documentation

What should be documented? Why is it important? Who is it for?

The Good and the Bad of Documentation

What does good and bad documentation look like?

Creating Documentation

Tips for creating documentation, including tools, use of templates and style guides and the importance of testing

Revising and Maintaining Documentation

How to update, manage versions and communicate changes

Preserving Documentation

Does your digital preservation documentation need preserving for the long term?

Interviews and Case Studies

A selection of written interviews and presentations from community members discussing their documentation practices

Further Resources

Additional sources of information on documentation and helpful examples


 

Suggested citation of current version

Digital Preservation Coalition (2023). Digital Preservation Documentation: a guide [http://doi.org/10.7207/documentation23-01]

Last updated

September 2023

Date of next planned review

2027

 

Do contact us with comments and feedback to help us improve this guide.

Thanks to Tom Woolley for the illustrations.

 


How this resource was created

Background

The idea of creating a Digital Preservation Policy Toolkit came from the University of Bristol’s request for help with creating their own organizational digital preservation policy. Rather than creating just one policy for one member, the DPC team decided to create a Preservation Policy Toolkit which would become available for all Coalition members. The Toolkit would be tested ‘on the fly’ whilst it was being developed, by staff from Bristol who would create a specific policy for their institution and feed back comments and suggestions to help refine the Toolkit.

The structure and content of the Toolkit was created in a three-day workshop that followed the format of a Book Sprint. The Sprint drew on an array of experience and resources that were gained throughout the SPRUCE Project as well as from the collaborative development of resources including the Digital Preservation Handbook and Executive Guide on Digital Preservation.

Each member of the Sprint team started by offering characteristics of their favourite digital preservation policies, and the example policies and discussion that came out of this session led to a clear direction of travel for the Toolkit. Content was created using successive iterations of brainstorming, writing down ideas, fleshing out text and multiple rounds of peer review in order to reach the end result. Google Drive proved to be an effective collaborative platform to develop the content, enabling multiple authors to work within the same document and the comment and feedback loop to occur rapidly. This blog post provides a more detailed description of the event.

Versions and history

Version 1 of the Digital Preservation Policy Toolkit was released to DPC Members in April 2020.

Version 2 of the Digital Preservation Policy Toolkit was released to the whole digital preservation community in March 2023. Revision and update was carried out by staff at the DPC and efforts were primarily focused on updating the template to refer to more recent policy examples. There is more information about how and why the toolkit was updated in this blog post.

Acknowledgements

This toolkit was created through the collaborative efforts of Stephen Gray, Emma Hancox, Debra Hiom, Hannah Lowery and Julian Warren from the University of Bristol; William Kilbride, Sarah Middleton, Jenny Mitcham and Paul Wheatley from the Digital Preservation Coalition; and invited experts Adrian Brown (Parliamentary Archives), Neil Grindley (Jisc), Edith Halvarsson (Bodleian Libraries, University of Oxford) and Natalie Harrower (Digital Repository Ireland).

Thanks are owed to the University of Bristol for funding and hosting the Toolkit Book Sprint, and in particular for feeding the Sprint team!

Thanks go to Joost van der Nat at DDHN for translating the latest policy work at the DDHN for us, which was fed into the melting pot of information from which we built the Toolkit.

Thanks to Tom Woolley for the lovely illustrations - it was great to work with Tom again, after we last worked together on the Digital Preservation Business Case Toolkit as part of the SPRUCE Project.

Thanks go to Martin Klein and Herbert Van de Sompel for advice on using Robust Links and to Colin Armstrong for implementing them in version 1 of this Toolkit.

And finally, a big thank you goes out to all the authors of existing resources on this subject which we have attempted to distill, enhance and build into this Toolkit and those who have made their policies available online for others to access and learn from.


Reviewing Your Policy

Information to guide future work on keeping your policy current and relevant. Use this to find out how to keep your preservation policy up-to-date.

A lot of work typically goes into developing and communicating a digital preservation policy. Its final approval and publication is cause for celebration within an organization, but that does not mean your work is done. Like any other policy, your digital preservation policy will need to be reviewed periodically and kept up to date to ensure its continued accuracy and relevance for your organization.

 

Frequency of review

Policies need to be reviewed at regular intervals, to ensure that they are reasonable, and that they remain current in terms of practice. When you are establishing your policy, you should develop a schedule for review alongside it that notes a timeline for review, and potentially the roles or personnel who will undertake the review.

When developing your review schedule, consider:

  • Whether your policy needs to be reviewed by any other internal or external parties; if so, make sure they are named and notified as part of the review scheduling.

  • Whether the policy needs to align with any higher-level policies or guidelines, either from organizational strategy documents, funder requirements, or legislative mandates (this may inform your review schedule).

  • Whether there are any documented processes that may change over time and need to be amended in the document.

  • Whether there are any new changes which need to be reviewed by particular committees/offices before they are accepted, and what institutional timelines need to be considered in your review schedule (e.g. does a policy oversight committee or similar meet at certain intervals?).

Digital preservation processes and technologies change quite rapidly, so a reasonable timeframe for a policy review is 3 to 5 years. Longer timeframes may mean that the policy is out of date before it has been reviewed, while shorter timeframes could be unnecessary or organizationally burdensome. This timeframe should inform the level of granularity covered by the policy: you should not include very specific details about, for example, the choice of a particular preservation format, if these details are likely to change frequently.

It is easy to set a schedule for policy review but not always so easy to stick to it! There are many 'live' digital preservation policies online that are several years out of date and where the published review date target clearly hasn’t been met. It is important therefore not just to note the review schedule on the document itself but also in your organizational calendars!

Remember, though, that you don’t have to wait until your agreed time before you carry out a policy review. A policy can be reviewed at any point and you should certainly try to do so if you feel it has become out of date. It is better to review it earlier than expected than later!

A policy review exercise doesn't always need to be too onerous a task. One benefit of a frequent review cycle is that only minor changes may be necessary each time. If a policy hasn't been reviewed for some time, it is more likely to be a larger piece of work to review and update.

 

Case studies

There is an interesting case study about policy review published as a blog post by Adam Harwood of the University of Sussex. As he explains, his organization reviews and refreshes its preservation policy annually, and the commitment (and reminders) to do so comes from senior management.

Another case study on the topic of policy review comes from Martin Gengenbach of the National Library of New Zealand who discusses some of the risks of lapsed policy documentation and the importance of relationship building when working in this space.

 

Keeping track of previous policies

As part of the process of policy update and review you should also consider what will happen to the superseded policy document. It is good practice to retain and preserve a copy of any previous preservation policy documents alongside the digital content you are preserving. Future generations of digital archivists (and indeed users) may find it helpful to see the evolution of digital preservation policy, and a full history of policy documentation will provide valuable information to facilitate an understanding of decisions and approaches with regard to digital content at a certain point in time. Some organizations maintain this information internally (for example as described by Adam Harwood in his blog) and others have already taken steps to provide this information to their users.

 


Policy Principles - Discovery and Access

This section covers policy statements around the discovery of digital content and how access is provided for users. Policy principles may include a statement of the commitment to access along with high level principles that will guide this work.

University of Sussex

The library will control access to certain digital content to comply with the Data Protection Act (2018) and any other relevant legislation.

Free public access to digital content will be provided online where possible, or in controlled conditions in The Keep reading room when appropriate.

University of Sussex Digital Preservation Policy (2022)

University of St Andrews

Staff, students, external researchers and other users will be provided with appropriate access to digital content now and in the future.

University of St Andrews Digital Preservation Policy (2021)

University of Cambridge

Access is fundamental to preservation. Not only does access enable readers to carry out their research, it enables them to alert library staff to any potential risks to collection materials (e.g., do files open and render as expected?) and helps them identify whether further action is necessary.

This principle will be followed by:

  • Providing as wide access as possible to staff and readers in ways that faithfully represent digital objects when they were created or acquired.
  • Periodic manual sampling to ensure that files can open and to visually check for any potential issues.
  • Communicating to staff and readers the conditions under which digital collection materials can be accessed, used, and re-used.
  • Providing a mechanism by which staff and readers can alert staff to access-related issues so that action can be taken.

Cambridge University Libraries Digital Preservation Policy (2021)

Wellcome Collection

Open access: We recognize that the purpose of digital stewardship is to facilitate access, and that access should only be restricted in line with the legal and ethical considerations outlined in our Access Policy.

Wellcome Collection Digital Preservation Policy 2019–2021 (2019)

National Archives of Australia

Digital records and digitized records will be delivered to users in ways that do not limit the use and re-use of the records' content, for example a download of a digital record in its original format, or a download of an unsecured pdf access copy of a digitised record.

National Archives of Australia Digital Preservation Policy (2020)


Policy Principles - Metadata Management

This section covers processes to create and maintain sufficient metadata to support preservation, discovery and use of preserved digital content. Topics covered in policy statements might include persistent identifiers, metadata to support rights or provenance of digital content, other types of metadata and standards.

British Geological Survey

‘Preservation metadata is metadata that supports the distinct requirements of digital preservation: maintaining the availability, identity, persistence, renderability, understandability and authenticity of digital objects over long periods of time.’ (Lavoie and Gartner, 2013, p.2). The preservation strategy describes the processes for preservation metadata capture and the use of permanent identifiers and DOIs.

Authenticity: The data is what it purports to be, is created or sent by the purported person, and at the purported time. This is shown e.g. in the provenance of data and preservation metadata.

British Geological Survey Digital Preservation Policy (2020)

University of Glasgow

In order to support the authenticity of digital resources, detailed metadata will be captured and maintained along with the digital record.

All actions undertaken as part of the preservation process (including any new metadata created as part of the process, e.g. preservation activity, custody/ownership) must also be fully documented in the preservation metadata associated with each record, to provide an audit trail.

A preservation metadata model will be defined and published.

University of Glasgow Digital Preservation Policy (2020)

University of Sussex

The library will ensure the reliability of digital content by documenting contextual information about an object in its metadata, which will be saved with it in the repository. We will make this metadata available to researchers who access the digital content.

The library will keep a permanent record of the provenance of digital content. Standard submissions processes will be used to gain control of new digital content and the lifecycle of digital content will be captured and stored in its metadata.

University of Sussex Digital Preservation Policy (2022)

National Archives of Australia

The record must be trusted as an accurate representation of the original record. The Archives will ensure authenticity through the operation of transparent and fully documented preservation strategies, and by capturing and providing the metadata required to describe the content, context and provenance of the record.

Preservation processes that result in any physical or logical change to a digital object will be logged and recorded in the associated metadata, to provide an audit trail. All changes to metadata will themselves be audited.

The relationship between any digital object and its metadata will be maintained persistently. A persistent, unique identifier will be assigned to every digital object at the point of ingest, and this identifier will be recorded within the associated metadata to provide a persistent link.

National Archives of Australia Digital Preservation Policy (2020)

