DPC Featured Project

The aim of this series is to highlight some of the DPC member projects listed in the DPC Members' Projects Table. The fifth interview, carried out in August 2006, sees Susan Thomas at Oxford University talk to Kieron Niven about the Paradigm project.

Published 20 September 2006

Paradigm Project

How did the Paradigm project come about?

The Paradigm project emerged at a time when the creation and provision of digital materials was increasing dramatically and future trends indicated that the demand for these materials would only grow. There was a great deal of research focused on the acquisition, creation and provision of ‘published’ digital materials, such as databases, e-journals and academic papers, but there was much less in relation to the preservation of these resources, and research into the acquisition and preservation of personal digital material did not seem to be happening at all.

The democratisation of computing has resulted in a growing quantity of important and everyday personal digital archives. Major research libraries, such as the Bodleian and the Rylands, have long acquired the personal archives of significant individuals, such as politicians, writers and scientists. These archives are important components of the collections of both institutions, and both libraries have a great deal of expertise in managing all aspects of the curation and preservation of these materials in analogue forms.

Existing digital preservation research told us that digital materials had very different preservation requirements to analogue materials, and that we could not expect these materials to persist by storing them in optimal environmental conditions, as we endeavour to do for their analogue equivalents. Although there was theory available to inform us, we felt that there was limited practical experience that we could draw on to develop policy, strategy and practice for the treatment of personal digital archives.
Most repositories were developed for more uniform collections and were focused on providing access. For personal archives, we needed a closed repository for the preservation of an infinite variety of digital materials. We needed to develop the skills, experience and infrastructure to manage digital materials, or the personal archives that provide personal insights into our times would suffer a reduction in quality, because we would be unable to accept their digital components.

Institutional concerns at the Bodleian and the Rylands coincided with a trend towards developing ‘Institutional Repositories’: systems designed to manage digital information created or owned by universities. JISC’s interest in this area provided us with a funding opportunity to explore relevant issues and to develop expertise. Specifically, we applied for funding under a Programme on Institutional Digital Preservation and Asset Management, which was designed to give institutions the opportunity to gain hands-on experience in digital preservation. The Paradigm project was conceived to address issues associated with personal digital material in the context of existing collecting and curatorial responsibilities by working with creators of personal digital archives and their collections.

What are the project's primary aims in terms of digital preservation?

The project’s primary aim is to enable the Bodleian and the Rylands to discover at first hand what preserving digital archives means. Paradigm is a pathfinder project for both institutions. It will be a building block for our future activities in this area and has allowed us to begin implementing some best practice now. By working with real creators of personal archives and their materials, we have developed a better understanding of the issues that are important to the individuals whose papers we collect, and how different collecting strategies might affect our relationship with them and the resulting archive.
We have also learned something of the scope and content of the materials that we might expect older, contemporary and even future personal digital archives to contain. By applying digital preservation tools and standards to traditional archival workflows we have been able to practise key events in the lifecycle of digital records, such as accession and ingest, and now have a better understanding of the skills, infrastructure and processes that we need to administer and preserve digital archives. All this experience gives us a platform for developing the capacity to continue selecting, taking in, managing, cataloguing, and making personal archives available to researchers.

The project partners appear to have determined the two political parties involved in the project but how were the individual politicians selected?

We worked with three political parties in the end: the Conservative Party, the Labour Party and the Liberal Democrats. We originally thought we would work with members of the Conservative and Labour parties, because the papers of the parties themselves are held at the Bodleian Library and the People’s History Museum in Manchester.

Selection of individuals was determined in two ways. Firstly, we wanted to work with individuals who fitted with the collections of our institutions and our collecting policies. Secondly, we wanted to represent a number of variables. Our academic advisory board were especially keen that the project should reflect a range of political roles and interests: ministers, MPs, peers and MEPs with local, national and international interests. This spread across institutions and parties has also helped the exemplar to represent the diversity of working methods in particular settings and the impact of individual attitudes towards IT, recordkeeping and long-term archiving for historical purposes. It has also provided different technical challenges, as the archivists have had to adapt to various technical set-ups.
In terms of data collection, who decided what data to archive: the depositor, the repository or a combination of both?

A combination of both. The depositor always has the final say in what they are willing to give to an archive, although the archive may refuse to accept it. The project specifically wants to deal with as many personal archive data types as possible in this exemplar, so we did not impose any restrictions on the basis of format, only on the basis of potential historical interest.

What was deposited, and how often, has varied from depositor to depositor. This is partly dependent on what records are created in the first place, and the project’s records survey helps the archivists to establish this. We must also agree what records the depositor is willing to deposit and when. Some records have a very short active life, and depositors are happy to deposit copies of them sooner. Other records are used for much longer, or are much more personal. Many creators prefer to keep these records themselves, but may deposit them in the future. Some depositors wish to weed archives prior to deposit, which can significantly delay their accession.

More work needs to be done in this area over a longer period of time. Curator-depositor relationships are very personal and often they are sustained over many years before a collection, or the entirety of a collection, is deposited in an archive. The length of the relationship provides an opportunity to build trust and understanding of the archiving process.

With regard to accessing the archives, how is this to be done? Will it be restricted or delayed/embargoed? What are the legal or privacy implications?

The majority of the archives accessioned by the project are acquired on short-term deposit for the purposes of the exemplar and there has never been any intention to make these available in the time frame of the project.
Those archives accessioned as part of our permanent collections will be subject to similar access negotiations as paper deposits. Legal issues, such as data protection, intellectual property rights, defamation, etc., will be identified and addressed, as will more individual concerns expressed by the depositors themselves. We must also protect the interests of the several third parties represented in personal archives.

Archivists obtain personal digital archives for the long term. We expect to be preserving and providing access to these materials indefinitely. Sometimes we must apply embargoes in order to accession an archive at all. The embargo may frustrate some researchers, but as long as the depositor agrees that we can provide access to the archive at a reasonable date in the future, this measure can save an archive that would otherwise be lost to posterity.

Susan Thomas recently gave a presentation at a DCC email curation workshop. How problematic have you found archiving email to be in comparison to other digital objects?

Email is a complex digital object. It is complex technologically because, despite the existence of a single standard for the exchange of email (RFC 2822), most clients actually store email in their own, often opaque, native format. The preservation-unfriendly .pst (personal store) file format, for example, is widespread and difficult to deal with.

Email clients and online email services do not make it easy for users to archive their email either. Many include import and export functions to move email between clients and services, but simple archiving tools which would allow users to extract the raw email and store a backup elsewhere are absent. Even where individuals do back up their email, the process is time-consuming, arduous and poorly understood. The variety of attachment formats associated with email accounts also presents a challenge.
To preserve email, we need to break it down into its component parts so we can understand what range of formats we have to preserve and what the relationships between the different components are. These relationships come in many forms: threads, hierarchical filing, labeling, email/attachment relationships, etc.

Much of the work on email preservation to date has been undertaken in a records management environment, where plug-ins have been added to email servers or clients to generate XML versions of email or to collect metadata. For collecting archivists this solution is not practical. We cannot enforce solutions on individuals, or expect them to undertake additional steps. Until software companies provide end-users with more preservation-friendly extract and archiving tools, a better solution for curators is to process email archives extracted in opaque formats into preservation-friendly formats, using tools such as the National Archives of Australia’s Xena.

Email is also one of the more complex objects from a social and legal standpoint. It often contains opinions and statements that could be defamatory if published, or personal data relating to many people, as well as several kinds of intellectual property right belonging to various individuals or organisations.

The way people use and arrange their email account(s) varies enormously too. Many people do not ‘manage’ their sent mail at all, and some methods used to manage incoming mail can make appraising or describing the email manually virtually impossible. The difficulty that many users experience in navigating and deleting email is widely acknowledged; this results in a great swell of content that presents the accessioning curator with a number of challenges: what bandwidth, storage capacity and time will be required simply to extract the email? What impact will this volume have on the design of repositories designed to preserve and deliver email archives to end users?
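
As an illustration of the decomposition step described in this answer (breaking an email into its component parts and recording the formats and relationships involved), a minimal sketch in Python follows. It is not the Paradigm project's tooling and does not use Xena; it assumes the message has already been extracted into standard RFC 2822 text, and the sample message, addresses, file name and the `describe` function are all hypothetical.

```python
from email import message_from_string, policy

# Hypothetical sample: one reply carrying a plain-text body and a Word attachment.
SAMPLE = """\
From: alice@example.org
To: bob@example.org
Subject: Draft speech
Message-ID: <msg2@example.org>
In-Reply-To: <msg1@example.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XYZ"

--XYZ
Content-Type: text/plain; charset="utf-8"

Please see the attached draft.
--XYZ
Content-Type: application/msword
Content-Disposition: attachment; filename="speech.doc"
Content-Transfer-Encoding: base64

SGVsbG8=
--XYZ--
"""


def header_text(msg, name):
    """Return a header as a plain string, or None if it is absent."""
    value = msg[name]
    return str(value).strip() if value is not None else None


def describe(raw):
    """List the component parts of one message and its thread pointers."""
    msg = message_from_string(raw, policy=policy.default)
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold other parts; only leaves carry content
        parts.append({
            "content_type": part.get_content_type(),  # format to be preserved
            "filename": part.get_filename(),          # None for the body text
            "is_attachment": part.is_attachment(),    # email/attachment link
        })
    return {
        "message_id": header_text(msg, "Message-ID"),
        "in_reply_to": header_text(msg, "In-Reply-To"),  # thread relationship
        "parts": parts,
    }


if __name__ == "__main__":
    info = describe(SAMPLE)
    print(info["message_id"], "replies to", info["in_reply_to"])
    for p in info["parts"]:
        print(p)
```

Run over a whole account, an inventory of this kind would give a curator the range of content types to plan for, while the `In-Reply-To` values link messages into threads. Filing hierarchies and labels live in the client's own store rather than in the message, so they are not captured here.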
One of the project's aims is to compare DSpace and Fedora. How have you gone about doing this? Have you established a set of criteria as a basis for comparison?

We have not yet finalised our criteria for comparison. In framing them we are considering the RLG/NARA audit criteria and the OAIS model itself. Part of the problem in devising an assessment is determining which aspects of policy and procedure should be mechanised, and whether they should form part of the repository software’s responsibility, or that of an external service or human process. Some important parts of the preservation process, such as the building of the Archival Information Package, do not take place using software such as Fedora and DSpace, and some crucial preservation activities, such as obsolescence monitoring and connections to file format registries, have yet to be implemented in either software.

The Workbook is an excellent way of documenting the project's progression and presenting the results. Has this been an easy thing to produce and has it had an effect on the way the project has been carried out?

Documenting results in this way is time-consuming and affects how much can be achieved, but the process provides a way of reflecting on the work as we move along. All the core project team were new to digital preservation and, as people who had to start from scratch ourselves, we wanted to make it easier for others with limited knowledge and experience to engage with digital preservation in a practical way. At the start of the project, we found a great deal of information that we could read, but practical guidance was lacking.

We have also found that fusing archival and technical experience in addressing the Workbook topics has helped the project team to develop a better overall understanding of digital curation and preservation. We also think it helps to provide a knowledge base of real-world examples as well as pointers to templates, software and techniques that others can discover too.
We do receive feedback and interaction because of the Workbook, partly because the format helps us to provide more detailed information than is possible in a presentation or theoretical paper. Practitioners do use the Workbook and get in touch about particular issues, and often we learn about new ways of doing things as a result. It helps us to help them. The project archivists also use it as a remote self-help tool when working with creators.

The project was funded by JISC for two years and is due to finish in February 2007. Has this been a suitable length of time in which to carry out the project or would you have liked to spend more time exploring certain aspects or archiving more data?

It is something of a contradiction to work on a short-term project on long-term preservation. We have learned a great deal in the time we have spent, but starting from scratch was challenging, and learning the basics and acquiring sample collections absorbed much of our time in the early phase of the project. While digital preservation experts are so scarce, three-year projects might be more appropriate. Despite these difficulties the project has given us an opportunity to undertake very useful work supported by the expertise and the resources of the JISC. We will build on this to develop the expertise we need to continue collecting personal archives.
DPC is a company limited by guarantee. Our Company Number is 4492292. © Digital Preservation Coalition 2002