DPC Featured Project

The aim of this series is to highlight some of the DPC member projects listed in the DPC Members' Projects Table. A project will be highlighted on the Home Page of the DPC website every 6-8 weeks and older versions will be retained in the table of DPC Member projects.

The sixth interview was carried out in January 2007 between Najla Semple and Hilary Beedham of the UK Data Archive. Hilary is also a member of the project team of the East of England Digital Preservation Regional Pilot Project [DARP], which was published as a report, available at:

Published 02 March 2007

East of England Digital Preservation Regional Pilot Project

What were the principal aims of the report?
The project was the next step following two pieces of work instigated by the East of England Regional Archives Council (EERAC). The first of these was the report Eastern Promise: a strategy for Archival Development in the East of England, which was published in spring 2003. This was followed by a survey of regional archives which was undertaken by the UK Data Archive (UKDA) on behalf of EERAC and MLA East of England. The latter was an attempt to assess the progress that regional archives were making in digital preservation.

What sort of organisations were involved in the creation of the report and why did you target them?
We acknowledged that the work would potentially be of interest to all archives, as evidenced by the fact that MLA East of England is a key partner. The most pressing needs were considered to be those of larger local authorities, who not only produce a large number of records themselves but also make available the archives of other people and organisations via record offices. For this reason it was decided to work with Bedfordshire and Hertfordshire County Councils (BCC and HCC), based on their responses to the survey and their clear interest in participating in the project. In particular, the responses from HCC demonstrated that they were well placed to participate, having already identified a need for a digital preservation service.

What were the practical elements of the pilot?
The most practical elements of the pilot were the production of the sample deposit form, which we included at the end of the report, plus the examination of electronic media provided by HCC and the sometimes unsuccessful attempts to read data from these... more on this later!

To what extent was there a confusion in terminology – archivists vs. digital preservation experts?
Terminology proved to be the least expected but greatest challenge, and it was only after a number of meetings that we all began to realise that it was causing problems. Once identified, however, we were able to take practical steps to remedy it. Typical of the difficulties it caused are the definitions of ‘records’ and ‘archives’, which is why we devoted a page to this at the beginning of the report. The term ‘record’ presented a particular problem since it has a very specific meaning for an archivist and records manager, but for the digital archivist working with databases (structured information) the meaning is quite different.
At the time this project began, the UKDA was also working on a project with The National Archives (TNA) to assess UKDA and TNA compliance with the Open Archival Information System (OAIS) reference model. Interestingly, terminology was also of relevance to this work because our organisations sometimes referred to similar processes but used different terms for them. The application of jointly understood OAIS terms removed any confusion. In addition, TNA found OAIS terms useful for interdepartmental discussions as terminology sometimes differed between departments. We did introduce some OAIS terms to the DARP discussions and the most important of these are flagged in the glossary to the report. However, because of the timescales involved, we limited the use of OAIS terminology because of the need for significant rewording if it was to be usable.

What file formats did you include in the study – all digital files?
Yes, we only looked at digital files. After the materials had been transported to UKDA, an initial examination quickly revealed that the original media had been stored in a far from perfect way: disk covers were covered in dust and some had post-it stickers attached to them. This highlighted one of the serious issues that will be faced by anyone embarking on the preservation of digital records.
The records we dealt with were only available ‘as collected’: they had not been copied onto any other system or media and they had not been placed into an environment where they were safely stored and accessible for further work. It shouldn’t be forgotten, though, that we quite deliberately considered old and obsolete media because we expected that attempts to read these media would allow us to produce documentary evidence of the problems we wanted to highlight. So, for example, we looked at a number of tapes, punch cards and 5 inch floppy discs which we expected to be able to read but perhaps not understand.
We also looked at discs and cartridges. One of the former remains a bit of a mystery – UKDA systems staff have a total of about 70 years’ experience reading digitised files but nobody had previously seen a 12 meg. disc cartridge. We subsequently passed this, plus some of the other media we couldn’t read in-house, to a specialist service for which we pay once the data have been read – as yet we’re still waiting for a bill!
The punch cards proved interesting as they showed that unless one records what information they contain, one can’t necessarily make good judgements as to whether they are worth keeping. Those we were given proved to be job control cards relating to a dataset that would have been stored elsewhere in the system, perhaps on additional punch cards. If the information on these cards was of value, one might assume that it would only be in the context of the additional, missing data media.

Can you say something about the methodology followed?
The methodology for the practical work was simple. We undertook a visual analysis of the media, which gave us an indication of the condition of the medium and, where there was written labelling, various types of information were gleaned: for example, in one instance an individual was named and in others the software used to write the file was identified.
Where possible, we then read the data from the original medium and transferred them onto our own system from where, depending on the content and the level of associated information in the files, we were able to reproduce the original information. Of the twelve media supplied, we were able to prepare the data on four to a preservation standard. In one case, we decided that the medium was faulty and the cartridge, a 3M DC 600A, was not sent to the specialist company. We did, however, send a 5 inch floppy BASF flexydisc and they confirmed that this disc was also faulty.
In several cases, we were able to extract data from the media but did not have the additional information needed to understand the structure of the files and, of course, without this information the data are of little value. One simple thing we hadn't planned was to show these files, on screen, to our project partners. The exercise was really useful because it helped our partners to understand that even where you can read the characters on a screen, it's impossible to make real sense of them without additional information. This part of the work demonstrated clearly that there needs to be an investment in the management of digital records to ensure that their content is known before a decision can be made as to whether they should be preserved.

What sort of questions did you ask participants?
The questions we asked were framed to fill the information gaps that prevented us from successfully reading and understanding the files on the various media. They were similar to the questions we routinely ask of those who deposit data at the UKDA but were not asked by local authorities with no experience of handling the structured digital databases with which the UKDA is so familiar. The local authorities found it useful to incorporate and adapt processes appropriately to their needs.
For example, deposit forms have been revised to seek more information about the digital records they expect to accession in future. A sample form is included at the end of the report and we have had feedback from readers to say how useful this will be for them. In addition, both the local authorities we worked with recognised the need to compile a 'history' of their computing systems as an aid to reading their own digital records in future.

What were the digital preservation findings?
There is potentially a very serious problem with legacy material. The fact that only 25% of the media we looked at were able to be preserved is alarming. It’s no use just storing data on digital media in what are thought to be safe environments – some of the media we worked on were damaged despite their having been placed in protective plastic cases – and assuming that they can be prepared for preservation at some future date. The longer these files are neglected, the less likelihood there is of their being useful in future. As we hope the case study on page 24 of the report shows, if digital records are to be preserved, action has to be taken at an earlier stage than for paper records. It’s essential that key information is collected and retained and a base level of work is undertaken as soon as possible after a digital record is created, otherwise records will either be lost or will only be recoverable with great effort and not insignificant expense.
A second finding is related to the first. As more and more records are born digital, there is an immediate need to put procedures and systems in place to ensure that newly created records do not become part of the legacy problem.
Both of the above findings point to a further finding: that there is an urgent need for some form of archival service or system for the regions.
Finally, there is effort and therefore cost involved in making digital records available for public access and retrieval, and anyone embarking on a programme of work to preserve such records will also need to consider how to satisfy these additional needs.

What were the costing findings and what were these for, how were they calculated?
We believe that this is the first work of its kind to actually provide a real indication of the costs of long term preservation. The figures provided are the result of identifying the costs of setting up and running a regional digital repository based on facilities comparable with those of the UKDA. They include the infrastructure costs and also staff costs, which were calculated using information from the UKDA’s internal recording systems that allow us to estimate the amount of staff time spent on different tasks. Based on these costs, we estimated that an organisation would need an investment in the region of £525,000 to establish an equivalent preservation service and run it for five years.
This represents a significant investment and it was noted that it could be more cost effective to ‘buy into’ an existing organisation, so we also undertook a theoretical exercise to work out how much it would cost an organisation to ‘buy into’ the existing UKDA preservation system. To cost this, we took fixed costs (e.g. the machine room requires a stable, humidity- and temperature-controlled environment with anti-intrusion and fire protection systems) plus staff costs, and used these to estimate the cost of renting space on our system. Several assumptions were made, the details of which can be found in the report, but we estimated that it would cost £35,500 to ‘rent’ 1 gigabyte of space. It’s perhaps worth mentioning that a couple of readers have had difficulty understanding the idea of a fixed overhead for space and took the figure of £35,500 as meaning that renting two gigabytes would double the figure.
This isn’t the case; the report's intention is to emphasise the fixed overhead cost of any space in a preservation system. We simply used 1 gigabyte because we thought it was an amount that most readers would be able to relate to.

Can you mention the two cost models that you came up with in the report?
The cost models we presented were chosen to fit the needs of two types of organisation: the first for larger organisations, such as county councils, that have well-developed in-house systems for the maintenance and preservation of digital records, and the second for smaller organisations that lack the in-house facilities needed to ensure long-term preservation. Details of both models, including the individual cost elements, can be found in the report.

Can you talk about creating a business case for institutions – how important is this for institutions?
It’s going to be really important for institutions to create a strong business case for this work. The problem is growing by the day and the costs of resolving it are likely to be significant. The National Council on Archives report, ‘Your Data at Risk’, identified three possible solutions and we considered each of these in the context of this project. It was felt that there is likely to be a need to contract out at least part of the process, and we hope that this report, with its practical demonstration of the problems associated with delay plus the two cost models and some hard figures, will make it easier for organisations to build their business case.
As the report points out, the market in this field is currently underdeveloped but we can expect this to change, whilst at the same time records managers face the challenges of making their own records comply with the Freedom of Information Act and with corporate retention requirements. Building a business case for this may be relatively simple.
However, extending the business case to external digital archives will be more challenging for local authorities and is likely to raise broader questions about the value of community archives, something which is being addressed by the Community Archives Development Group.

Finally, would the establishment of a regional repository be feasible and what would the challenges be?
It is hoped that this report will not mark the end of this collaboration. The report has been well received and it would be a shame if, having created a momentum, work were to stop at this point. MLA East of England and the UKDA are keen to take next steps and are investigating the possibilities for a new project to test the feasibility of setting up a regional archive.
Undoubtedly there will be challenges; the first will be to gain funding to take the work forward. As a group we are looking at this actively and are cautiously optimistic. No doubt there will be other challenges: some, like the issue of terminology faced in this project, may not be anticipated and will have to be resolved as they arise; others, such as the need to raise awareness of the importance of storage conditions for media, will also have to be addressed.
DPC is a company limited by Guarantee. Our Company Number is 4492292