DPC

Library of Congress and DPC sign agreement

Added on 23 June 2004

DPC signs Memorandum of Understanding with the Library of Congress


The National Digital Information Infrastructure and Preservation Program (NDIIPP) was established after Congress gave the Library of Congress approval to develop the program in December 2000. In January 2004, Congress approved the Library's plan for NDIIPP, enabling it to launch the first phase of building a national infrastructure for the collection and long-term preservation of digital content. The funds released will allow various technical models for capturing and preserving content to be tested.


Digital Preservation: the global context

Report on the DPC Forum held at the British Library Conference Centre, Wednesday 23 June 2004.

The 8th DPC Forum attracted the biggest audience to date for a DPC Forum. Around 100 delegates were kept interested and informed by a very rich programme with presentations from experts from the US and Europe. One key theme running throughout the day was the need for active collaboration at every level and across sectoral and geographic boundaries; speaker after speaker illustrated how essential this collaboration is. Other consistent messages were the importance of trust (between partners and stakeholders in the emerging technical infrastructure), the need to find effective mechanisms for distributing responsibilities, the development of standards and tools and, above all, the need to develop and share practical experience.


Speakers from the 8th DPC Forum, Digital Preservation: the global context
L to R Taylor Surface, OCLC; Robin Dale, RLG; Nancy McGovern, Cornell; Peter Burnhill, DCC; Seamus Ross, HATII;
Eileen Fenton, Ithaka; David Seaman, DLF; Vicky Reich, LOCKSS; Tony Hey, eSCP; Laura Campbell, Library of Congress

Delegates received a sense of the broad range of activities going on, the progress that has been made, and the increasingly compelling need to accelerate that progress. Feedback from the Forum, both formal and informal, has been overwhelmingly positive and is indicative of the consistently high quality of the presentations and a stimulating, thought-provoking programme.

The evening before the Forum saw the presentation of the annual Conservation Awards, which included the inaugural DPC Award for Digital Preservation. This was won by the National Archives for their Digital Archive, with the CAMiLEON project receiving a special commendation. The event was regarded as another stepping-stone on the way to raising the profile of digital preservation. The Forum was equally important, bringing together people from all over the world and recognising that international collaboration is essential: no one can do this on their own.

Lynne Brindley, Chief Executive of the British Library and Chair of the DPC Board, chaired the day and noted in her welcome the importance the DPC placed on international links and the need to ensure that digital preservation issues are increasingly on the political and policy agendas. The DPC is committed to making practical progress and to sharing best practice through its membership.

Ms Brindley introduced the first speaker, Laura Campbell, Associate Director for Strategic Initiatives at the Library of Congress, who provided an up-to-date picture of what the Library of Congress was doing through NDIIPP (National Digital Information Infrastructure and Preservation Program) (PDF 772KB). Ms Campbell described how NDIIPP developed as a result of a report commissioned by the Library of Congress to assess whether it was prepared for the 21st century. Much experience in digital technology had come from building the Library's Digital Library, where they had learned the power of digital surrogates as well as their vulnerability to loss.

The plan, worth up to US$175m including matching funds, consisted of $5m approved by Congress to produce a plan, $20m released on approval of the plan, and a further $75m contingent on obtaining matching funds. Scenario planning helped show how a distributed effort might operate.

Key lessons and messages of NDIIPP to date include the belief that there will never be a single right way of doing things, so the architecture needs to be sufficiently modular and flexible to take account of this; the need for a distributed and decentralised approach; and the need for new tools and technologies. NDIIPP needs to build partnerships and networks and then create a technical infrastructure to support the partners. Partnerships already forged include an alliance with the DPC, helping to establish the International Internet Preservation Consortium (IIPC), business model partnerships such as subscription and archiving services for e-journals, and technical partnerships, taking full advantage of the technical talent which exists.

The next stage of NDIIPP would include testing architectures to support archive ingest and handling. In summing up, Ms Campbell indicated that during the next five years, the intention was to form a range of formal partnerships, encourage standards for digital preservation, establish a governing body, and make recommendations to Congress for funding.

Seamus Ross, Director of HATII (Humanities Advanced Technology and Information Institute), provided an introduction to ERPANET (PDF 1087KB) (Electronic Resource Preservation and Access Network), the European Commission-funded project which has brought together partners from Italy, the Netherlands, the UK and Switzerland. ERPANET has created a number of resources, organised seminars on several key topics, carried out an analysis of relevant literature and developed other tools, such as business cases for digital preservation and off-the-shelf policy statements. It was stressed that a lot of expertise already exists but there is a pressing need to bring it together and to work together.

Lessons learned were that the digital preservation community needs practical case studies and reports of "real world" experience. Simple tools for costing digital preservation exist but much more work needs to be done here. Guidance on digital repository design is also needed. ERPA E-prints (a repository of digital preservation papers and reports) is growing very slowly and needs to be marketed better. ERPANET has negotiated with the Swiss National Archives to preserve material held in this repository in perpetuity. In summing up, Dr Ross emphasised the great need for knowledge sharing, so ERPANET events and DPC Forums are extremely important in helping to raise the level of awareness and understanding.

Presentations from David Seaman (Director of the Digital Library Federation), Robin Dale (Program Officer at the Research Libraries Group) and Taylor Surface (OCLC) described the work their organisations are doing in developing practical, collaborative tools, all of which will play a role in increasing trust in the developing infrastructure for digital preservation.

David Seaman's presentation, 'Towards a Global File Format Registry' (PDF 67KB) described the developing global file format registry, which is responding to an immediate need. The importance and value of linking to other relevant work, such as the National Archives' PRONOM system and the DCC (Digital Curation Centre) in the UK, was also stressed.

The title of Robin Dale's presentation, 'The Devil's in the Details - working towards global consensus for digital repository certification' (PDF 77KB), aptly summarised the challenge of articulating and reaching broad consensus on what elements and what process can be put in place to certify digital repositories against a commonly understood standard.

Taylor Surface described the work of OCLC's Digital Collections and Preservation Services in 'The OCLC Registry of Digital Masters' (PDF 464KB), which arose from a DLF Steering Committee recommendation. Taylor described how the registry links to OCLC's WorldCat service to provide enhanced discovery, encourage the use of standards and limit duplication of effort among digitisation initiatives.

During the lunch break, Lynne Brindley and Laura Campbell signed an agreement between the DPC and the Library of Congress. A poster session on the Digital Curation Centre gave delegates the opportunity to ask specific questions before the afternoon presentations.

The afternoon session began with Tony Hey, Director of the e-Science Core Programme, whose presentation was 'e-Science - preserving the data deluge' (PDF 543KB). The e-Science grid (or cyberinfrastructure, as it is known in the US) pursues the vision enunciated by Licklider of bringing together material from throughout the world into a truly global, collaborative environment which enables researchers to work together regardless of geographic location. Describing the impetus for the development of the DCC, Dr Hey said that over the next five years e-science will produce more scientific data than has been collected in the whole of human history. The goal is to bring together the digital library community with the scientific community so that each can learn from the other.

Peter Burnhill, Interim Director of the DCC, described 'The Digital Curation Centre' (PDF 146KB), which has received funding of £1.3m p.a. from JISC and the e-Science Core Programme. The DCC is not a digital repository, he said, but will provide services and research for the community involved in digital preservation. It is still very early days, as the DCC has only been operational for a few months, but progress has been made: a website has been launched, an e-journal is planned, and focus groups will help to articulate who the user community for the DCC is and what their needs are. It was anticipated that a permanent Director would be in post by the official launch, scheduled for early November.

Nancy McGovern spoke on 'The Cornell Digital Preservation Online Tutorial and Workshop' (PDF 419KB). This is yet another illustration of the pressing need to develop practical support for those already involved in, or about to embark on, digital preservation programmes. It was also another example of the strength of collaboration, as the curriculum had been developed collaboratively, and Cornell looked forward to working closely with the DPC, who have been inspired by Cornell's work to develop a similar programme geared towards the UK. Nancy described the five organisational stages of digital preservation: Acknowledge; Act; Consolidate; Institutionalise; Externalise. She noted that none of these stages can be skipped and that it is essential to realise there is no on/off switch for digital preservation; it is something which needs to be built over time. Cornell has now run four workshops, which have received very positive feedback from participants. All have been oversubscribed, which illustrates the need for intensive training providing a toolkit that enables participants to adopt practical short-term strategies appropriate to their own institutional settings. A fifth workshop is planned for November 2004.

The final session of the day provided an opportunity to hear two very different approaches to preserving e-journals. Vicky Reich described the 'LOCKSS Program approach' (PDF 804KB), which is applicable to any content available over the HTTP protocol and enables libraries to collect and preserve content in the same way as they do for print. Vicky stressed that LOCKSS preserves the content, not the services publishers provide (e.g. search buttons). LOCKSS has established contact with several publishers, and it is essential to have the cooperation of publishers to allow LOCKSS crawlers to gather their content. Trust was also an issue here: publishers need to trust libraries to gather the content they have purchased under licence. Key advantages of LOCKSS are its inbuilt redundancy and its ease and low cost of installation. Vicky stressed that some institutions need large, central repositories as well, but this need not preclude the use of LOCKSS.

The final speaker of the day was Eileen Fenton, on 'Preserving e-journals: the JSTOR model' (PDF 58KB). The Electronic Archiving Initiative has involved working with publishers and is focused on preserving the source files. Archiving e-journals requires a significant investment in the development of both organisational and technological infrastructure; it is not a case of either/or. Eileen also described Ithaka, a not-for-profit company supported by Mellon, Hewlett and Niarchos funding, whose goal is to fill gaps not being met by the free market. Both Eileen and Vicky agreed that at this nascent stage of development, the community needs multiple approaches.

In closing the Forum, Lynne Brindley thanked all of the speakers for the significant contribution they had made to its success. The next DPC Forum will be a joint DPC/CURL event and will be held on Tuesday 19 October 2004. Further details will be available in the coming months.


Archives: adapting to the digital age

Report on the DPC Forum, Archives: adapting to the digital age

Held at the National Archives, Kew, Wednesday 24th September 2003

Around 40 participants attended the 7th DPC Forum, which was held at the National Archives (TNA), Kew. The Forum was timed to coincide with Archives Awareness Month, so it was appropriately held at TNA and focussed on archives in the digital age. It also coincided with the anticipated autumn internet launch of TNA's PRONOM database (http://www.nationalarchives.gov.uk/preservation/webarchive/default.htm/pronom/documentation.htm - update, 26 September 2007: this link is no longer active; information on PRONOM can now be found at http://www.nationalarchives.gov.uk/aboutapps/pronom/) and the UK Central Government Web Archive (http://www.nationalarchives.gov.uk/preservation/webarchive - update, 03 October 2007: new location http://www.nationalarchives.gov.uk/preservation/archivedwebsites.htm). Demonstrations of both of these were provided to participants in the afternoon sessions.

David Thomas, Director, Government and Archive Services at TNA, chaired the Forum and, in his welcome and introduction, noted that TNA was in a process of change but now has "real stuff" to show, as opposed to abstract discussions. The sessions that followed, which preceded demonstrations of PRONOM and the Digital Archive and tours of TNA, were informative, thoughtful, and stimulated lively discussion.

Session 1: Electronic Records Management

Richard Blake, Head of the TNA's Records Management Advisory Service, placed ERM in a strategic framework. He stressed that this is an issue affecting any business, not just archives: if material is to be held for more than five years, preservation issues will inevitably arise. It is necessary to ensure that records are kept useable, whether they are being kept for twenty years or taken into an archive for permanent retention.

He referred to BS ISO 15489, the first international standard for records management. There are, however, some problems in applying this standard in practice, as it does not define in enough detail what each of the four key characteristics (authenticity, reliability, integrity, usability) is. The presentation looked at issues associated with each of these characteristics, and the major theme running through them was the need to ensure that records management systems function within a strong intellectual framework that articulates such details as, for example, what additions and annotations are permissible. This needs to go beyond "just buying the technology" if the authenticity, reliability and integrity of electronic records are not to be challenged in future. TNA plays a guidance role in this area and most of its guidance is available from the TNA website. Finally, Richard noted that in such a new and rapidly evolving area we have to accept that mistakes will be made, but we should at least be able to understand why we made them.

Stuart Orr, Assistant Director in the Information and Workplace Strategies Directorate of the DTI, provided a useful case study of how ERM was implemented at the DTI. The problem was that there had been increasing devolvement in government departments under the previous government, which had led to information being stored in non-standard ways and to difficulties in sharing it. The Secretary of State for the DTI, Patricia Hewitt, recognised the need for improved means of sharing and storing information so that a better service could be provided for those seeking information from the DTI. The Matrix project was developed to address this problem and was rolled out across 22 sites within the UK, with c. 5,000 users. The presentation described what Matrix will and will not provide: for example, it is expected to support collaborative working, but it will not introduce the paperless office. It will also not work unless there is investment of time and effort; people need to input quality information and this is difficult to control in a devolved environment. Around 60 people worked on the Matrix project, including a full-time communications manager. A step-by-step approach was taken, beginning in May 2000 with plan and prototype, leading on to testing and trialling, and finally to rollout in May 2002. Bringing staff on board and providing training were seen as key elements, and Stuart said that they could have invested even more in training. In terms of long-term preservation, the DTI still has a lot of questions. They are looking to TNA for advice and are conscious of the need for caution before investing in preservation infrastructure.

Session 2: Collecting and preserving digital materials

Kevin Schurer, Director of the UK Data Archive and the recently established Economic and Social Data Service, provided a fascinating historical overview of the first 35 years of the UK Data Archive. Kevin noted that the UKDA is not a legal repository in the sense that TNA is, but its service goes well beyond data delivery, so there are synergies between the two. Many changes have occurred in the 35 years since the UKDA was established. The material has diversified, so that not only survey data but also, for example, sound recordings and pictures are now included in data collections. The number of users has increased greatly and has doubled over the past few years. Formats have changed, particularly since the mid-1990s: until then, magnetic tape was the dominant input and dissemination medium, whereas CD-ROMs and web delivery have now become much more prevalent. There are still older forms, such as punched cards, in the collection, and a punched-card reader was purchased recently (though it had been difficult to locate!). While preservation was not seen as an issue when the UKDA was being set up in the 1960s, it has become one because of the emphasis on providing research material to the academic sector; this inevitably means preservation issues must be addressed in order to keep the material useable. The Data Exchange Initiative was seen as a potential bridge between the need to push material out in user-friendly formats and the need to retain data in XML, which makes preservation simpler to manage. Work on the XML schema will commence in early 2004 as a two-year project. While the UKDA works with a limited sub-set of digital information, it must still deal with most of the challenges which legal archives need to address, because of its remit to provide access to research materials. Kevin provided copies of the UKDA's preservation policy to participants [Note: this will be available from the members' pages on the DPC website in the near future].

David Ryan, Head of Archive Services at TNA, gave the final presentation of the morning, describing TNA's work on web archiving. There are two broad approaches to web archiving, selective and harvesting, and David outlined the pros and cons of each before describing the approach being taken by TNA: to evaluate a number of technical approaches, develop a selection policy for websites, work with government departments to develop guidance, and develop long-term preservation strategies. The Modernising Government white paper provided the impetus for using the web as a communication mechanism, with its target of all Government services being available online by 2005. The spin-off benefit of this is that preservation and presentation are brought together: people can see both the benefits and the limitations of what current technology can provide. Issues include the size of the domain, estimated at c. 2,500 sites (though this is difficult to track as not all use .gov.uk); increasingly dynamic (and therefore more complicated) content; copyright; and legal deposit.


Programme 

9.30  Registration and Coffee
10.00  Introduction and welcome, David Thomas
  Session 1 - Electronic Records Management - Chair - David Thomas
10.15 Electronic Records Management - the role of TNA (PDF 191KB)
Richard Blake
10.45 Introducing ERM at DTI (PDF 1.6MB)
Stuart Orr
11.15 Short break
  Session 2 - Collecting and preserving digital materials - Chair - David Thomas
11.30 The UK Data Archive and the Experience of Digital Preservation (PDF 311KB)
Kevin Schurer
12.00 Collecting Government websites at TNA (PDF 1.3MB)
David Ryan
12.30 Lunch
13.30 Stream 1:  Demonstrations of Digital Archive and PRONOM
by Adrian Brown and Jo Pettitt
  Stream 2:  Tour of TNA led by Kelvin Smith
14.30 Short Break
14.40 Stream 1:  Tour of TNA led by Kelvin Smith
  Stream 2: Demonstrations of the Digital Archive and PRONOM
  By Adrian Brown and Jo Pettitt
15.40 Discussion and final wrap-up
16.00 Close

Open Source and Dynamic Databases

Programme

In keeping with using the Forums as a means to keep participants up to date with the latest developments in digital preservation, the 6th DPC Forum focussed on Open Source Software and Dynamic Databases, both of which have been the subject of debate and speculation. A stimulating and thought provoking day began with four presentations on OSS.

Alan Robiette provided a comprehensive, historical overview of the development of OSS and its pros and cons. A key drawback (lack of support) was echoed in other presentations. The early stages of the MIT-Cambridge DSpace collaboration were described in Julie Walker and Anne Murray's presentation. William Nixon discussed the issues in building a network of institutional repositories as part of the DAEDALUS project at the University of Glasgow.

Jo Pettitt described the National Archives' trials and pilots programme, which addresses some of the practical issues they are facing; open source software testing is part of this programme. Demonstrations of OCLC's Digital Archive and DSpace were provided during the lunch break.

The afternoon session had three presentations on archiving dynamic databases. Peter Buneman's presentation was on archiving scientific data and referred to the tension between archiving scientific data frequently, which consumes space, and infrequently, which results in delays. Experimental work being undertaken on data structures offers some promise for affordable, persistent scientific archives.

Bryan Lawrence described the role of the British Atmospheric Data Centre, NERC's designated centre for atmospheric data, and referred to the influence of the OAIS model in recognising that data storage is part of a wider picture of consumers and producers, with data repositories acting as facilitators between the two. Finally, Cathy Smith provided a lively update on the evolution of the BBC website and the issues involved in archiving it.

9.30 - 10.00  Registration and Coffee
10.00 - 10.10  Welcome and Introduction

Session 1 - Open Source and Digital Preservation

10.10 - 10.40  Open Source & Commercial Software (PDF 194KB)
Alan Robiette, Programme Director, JISC
10.40 - 11.10  The DSpace at Cambridge Project (PDF 547KB)
Anne Murray, Cambridge University and Julie Walker, MIT
11.10 - 11.40  Experiences With E-prints and DSpace (PDF 943KB)
William Nixon, Glasgow University
11.40 - 12.10  The Open Source Evaluation Project at the National Archives (PDF 1.9MB)
Jo Pettitt, National Archives
12.10 - 12.30  Discussion
12.30 - 2.00  Lunch and Demonstrations (DSpace and OCLC Digital Archive)

Session 2 - Approaches to Archiving Dynamic Databases

2.00 - 2.30  Archiving Dynamic Databases (PDF 436KB)
Professor Peter Buneman, Edinburgh University
2.30 - 3.00  Experiences with Archiving Databases in BADC (PDF 2.42MB)
Bryan Lawrence, British Atmospheric Data Centre
3.00 - 3.30  Coffee
3.30 - 4.00  Ten Years on the Web: Archiving BBCi Online (PDF 632KB)
Cathy Smith, BBC
4.00 - 4.30  Concluding Discussion

Infrastructure and Development

Programme

9.30 - 10.00  Registration and Coffee
10.00 - 10.10  Welcome and Introduction
Gerry Slater, Chief Executive PRONI, Member of DPC Board of Directors

Session 1 - Developing Infrastructure for Digital Preservation in the UK

10.10 - 10.40  The DPC Business Plan and the Proposed UK Needs Survey (PDF 105KB)
Duncan Simpson, Consultant, responsible for preparation of the DPC Business Plan
10.40 - 11.10  The Digital Curation Centre (PDF 149KB)
Neil Beagrie, Programme Director, JISC
11.10 - 11.40  File Format Registries (PDF 126KB)
Derek Sergeant, Leeds University
11.40 - 12.10  The Needs of E-Science: Report Recommendations (PDF 407KB)
Phillip Lord, Director, and Alison Macdonald, Senior Consultant, The Digital Archiving Consultancy
12.10 - 12.30  Discussion
12.30 - 1.30  Lunch

Session 2 - Current Research and Developments

1.30 - 2.00  Conservation and Management of Digital Works of Art (PDF 1.08MB)
Pip Laurenson, Sculpture Conservator for Electronic Media, Tate
2.00 - 2.30  The AHDS Preservation Review (PDF 143KB)
Sheila Anderson, Director, and Hamish James, Collections Manager, AHDS
2.30 - 3.00  Archiving Subscription E-Journals (PDF 129KB)
Maggie Jones, JISC
3.00 - 3.30  Coffee
3.30 - 4.00  Report from DPC Web-archiving Special Interest Group (PDF 116KB)
Deb Woodyard, Digital Preservation Co-ordinator, The British Library
4.00 - 4.30  Concluding Discussion

Preservation of e-Learning Materials and Cost Models for Digital Preservation

Presentations

This was the fourth forum held by the Digital Preservation Coalition. The Forums aim to share experience from leading projects with DPC member organisations and to address topical issues in digital preservation. With the growth of distance learning and an increasing effort to digitise collections and package them for use in education in schools, colleges and universities, developing digital repositories and standards for managing learning objects are growing issues. The first session of the day focused on repositories, the preservation of e-learning and leading initiatives in the field. The second session, in the afternoon, addressed cost models for digital preservation, and speakers from industry and the public sector covered different approaches to and aspects of digital preservation costs.


Nick Wainwright from Hewlett Packard  

The first presentation was given by MacKenzie Smith (MIT) and Nick Wainwright (Hewlett Packard) on DSpace (PDF 1.08MB). DSpace is emerging repository software developed at MIT; it will be open source and federated, and aims to provide a preservation archive. It aims to offer large-scale, stable, managed long-term storage; support for a range of digital formats; easy-to-use submission procedures; persistent network identifiers; access control; and digital preservation services. Amongst the many other interesting things said was the distinction MIT makes between known supported data types (e.g. TIFF, SGML/XML, PDF); known unsupported data types (e.g. Microsoft Word, PowerPoint); and unknown/unsupported data types (e.g. a one-off computer program). Reference was made to discussion within the Digital Library Federation in the USA on establishing Digital Format Registries, which would capture format documentation such as specifications at a more granular level than MIME (e.g. TIFF 5.0, not just TIFF).

The next presentation was by Lorna Campbell (Centre for Educational Technology Interoperability Standards (CETIS)) on Learning Technology Standards and Digital Repository Standards (PDF 87KB). CETIS is supported by JISC to provide support to UK Higher and Further Education. CETIS has a number of special interest groups on accessibility, assessment, educational content, learner information packages and metadata, and an FE Focus Group. Learning technology standards and specifications are designed to facilitate the description, packaging, sequencing and delivery of educational content, learning activities and learner information. They are needed to facilitate interoperability, to prevent content being 'locked into' proprietary systems, to ensure that educational content is durable and reusable, and to enable the sharing of content and learner information. Future considerations include: registries and directories; digital rights management; location and resolution services; request and delivery services; and web services.


Prof. Bruce Royan of SCRAN

The last presentation before lunch was by Prof. Bruce Royan of SCRAN (Scottish Cultural Resources Access Network) on the subject of eLearning and the Business Case for Digital Libraries (PDF 3.08MB). An important part of his presentation was on the topic of sustainability, from both a technical and a financial point of view. He discussed several business models that could be adopted to attain a sustainable service, pointed out the dangers of relying solely on the strategy of obtaining grants, and emphasised the need to develop revenue funding. He then described the licensing model developed in SCRAN and the use it makes of authorisation, authentication, watermarking and fingerprinting, and went on to describe the RAID conception of what digital objects should be: Re-usable; Accessible; Interoperable; and Durable. His conclusions were: digital preservation implies significant ongoing costs and must be underpinned by strong business models; the most likely source of funding for these costs is educational licensing; licensed resource services must meet the evolving needs of their customers; and digital library services like SCRAN must adapt to the standards and frameworks of Learning Object Repositories.

The first presentation after lunch was by Alison MacDonald (Secure Sciences Ltd), whose presentation was entitled Bits & Bobs, DPC Presentation (PDF 1.04MB). She began by discussing the nature of costs and the problem of identifying costs in digital preservation, and found the OAIS model useful in identifying them. She pointed out that digital preservation is a sub-activity: it implies a purpose behind retention. Economies of scale imply one or more real or virtual archives. She went on to describe the complexity, the components, the variables and the cost drivers of digital preservation costs. She raised the question of where the costs of digital preservation fall and what the accounting problems are, and explored the need for activity-based costing. One of her conclusions was that "an understanding of costs in digital preservation is an important factor in managing and maintaining maximum funding for digital archives."


Meg Bellinger from OCLC 

The next presentation was by Meg Bellinger (OCLC) and was entitled Cost and Business Models for Digital Preservation: developing digital lifecycle management services at OCLC (PDF 85KB). After describing different business and sustainability models, she went on to describe the nature of OCLC and its business model. She saw current issues as being: certification of digital archives; the significant attributes of objects that must be preserved; models for cooperative repository networks and services; systems for the persistent identification of digital objects; intellectual property rights; technical strategies for continuing access; the minimal level of metadata required for long-term management and tools to automate its extraction; and economic sustainability. The development costs for the OCLC Digital Archive had been $3.1 million (approximately £2 million). She identified the unknown costs of digital preservation as being: managing technological changes over time; the proliferation of data types; the lack of standardisation of data types; and the problem of defining what is essential.


Helen Shenton from British Library

The final presentation of the day was by Helen Shenton (British Library), entitled Developing Life Cycle Models of the British Library - work in progress (PDF 168KB).
She described the objectives of this work as being to: establish the optimum apportionment of resources between phases of the life-cycle of the BL collections (traditional and digital) now, and define the impact for the future; identify, document and, if possible, benchmark the collection service and the interdependencies between each stage of the life cycle; and produce a report, recommendations and an implementation strategy for a policy and economic framework by April 2003. The means to realise these aims were to examine an earlier traditional model to see if it could be adopted, and to examine whether the traditional model could be adapted for the digital. The objective was to combine traditional and digital models to reflect all of the BL's collections. An analysis of activity-based costing for collection management and digitisation projects was presented.

Before a general discussion Neil Beagrie described the JISC digital preservation strategy for 2002-5. There were a number of elements directly relevant to the forum including the establishment of a Digital Curation Centre, and feasibility studies on the preservation of e-learning objects and e-prints, which would be announced shortly.

Overall, the meeting illustrated well the fact that the most intractable problems for digital preservation are political and economic. What does a sustainable cost model for digital preservation look like? How can a business plan be made given the proliferation of data formats and the lack of standards? How can institutions such as the British Library respond to an increasing workload with a fixed budget? But the meeting also demonstrated that the best response to these problems is creativity and hard work.

Programme

15th October, 2002. Prospect House, York Road, London, SE1 7AW

09.30 - 10.00 Registration and coffee

10.00 - 10.10 Welcome and Introduction

Session 1

10.10 - 11.00 DSpace - MacKenzie Smith (MIT) & Nick Wainwright (Hewlett Packard)

11.00 - 11.30 Standards and Digital Repositories - Bill Oliver & Lorna Campbell (Centre for Educational Technology Interoperability Standards)

11.30 - 12.00 E-Learning and the Business Case for Digital Libraries: A Case Study from SCRAN - Bruce Royan (SCRAN)

12.00 - 12.30 Panel questions and discussion.

12.30 - 13.45 Lunch

Session 2

13.45 - 14.30 Bits and Bobs: Digital Preservation and Costs - Alison MacDonald (Secure Sciences Ltd)

14.30 - 15.00 Cost and Business Models for Digital Preservation: Developing Digital Lifecycle Management Services at OCLC - Meg Bellinger (OCLC)

15.00 - 15.30 Developing Cost Models at the British Library - Helen Shenton (British Library)

15.30 - 16.00 Tea/Coffee

16.00 - 16.30 Closing Discussion

16.30 - Close of Forum
 

Presentations


Digital File Longevity (compiled for R&D in Digital Asset Preservation)

Resources compiled by Julian Jackson

This subject is relatively new and there are not many sources dealing with the longevity of digital image files. The sites below list a variety of resources which will be helpful in finding out more. Howard Besser of UCLA, in particular, has written much on the subject and compiled links to other useful resources.

We know that photographic negatives, transparencies and prints last a long time; they are reliable forms of storing data. Recently the Royal Geographical Society reprinted Frank Hurley's pictures from the 1913 Antarctic expedition - from his original glass negatives, nearly 100 years old. This is an example of how robust the storage medium was - remember, these negatives had been in sub-zero conditions and transported across an ocean in a tiny lifeboat!

In the headlong rush to put photographic images into digital form, little thought has been given to the problem of the longevity of digital files. There is an assumption that they will be lasting, but that is under question.

"There is growing realisation that this investment and future access to digital resources, are threatened by technology obsolescence and to a lesser degree by the fragility of digital media. The rate of change in computing technologies is such that information can be rendered inaccessible within a decade. Preservation is therefore a more immediate issue for digital than for traditional resources. Digital resources will not survive or remain accessible by accident: pro-active preservation is needed." Joint Information Systems Committee: Why Digital Preservation?

The 1086 Domesday Book, instigated by William the Conqueror, is still intact and available to be read by qualified researchers in the Public Record Office. In 1986 the BBC created a new Domesday Book about the state of the nation, costing £2.5 million. It is now unreadable. It contained 25,000 maps, 50,000 pictures, 60 minutes of footage, and millions of words, but it was made on special disks which could only be read in the BBC micro computer. There are only a few of these left in existence, and most of them don't work. This Domesday Book Mark 2 lasted less than 16 years.

Digital media have to be stored, and the physical medium they are stored on, for instance a computer's hard disk drive or a CD-ROM, has a finite lifespan. But the primary problem is obsolescence. Computer formats sink into oblivion very rapidly. Howard Besser, of the UCLA School of Education & Information Studies, says: "Fifteen years ago Wordstar had (by far) the largest market penetration of any word processing program. But few people today can read any of the many millions of Wordstar files, even when those have been transferred onto contemporary computer hard disks. Even today's popular word processing applications (such as Microsoft Word) typically cannot view files created any further back than two previous versions of the same application (and sometimes these still lose important formatting). Image and multimedia formats, lacking an underlying basis of ascii text, pose much greater obsolescence problems, as each format chooses to code image, sound, or control (synching) representation in a different way."

If an image has been generated on negative or transparency, then scanned and transformed into a digital file, then the original is safe. However if it has been digitally originated, such as much of today's news and sport photography, then vital parts of our cultural heritage may be lost forever. This problem will get worse as more photography becomes completely digital.

The two aspects of the problem

The longevity problem can be divided into two questions: the lifespan of the medium on which the file is stored, e.g. a CD-ROM, and the obsolescence of the format. Digital formats age quite rapidly because they are superseded by new formats, particularly if they are proprietary ones.

As the British Joint Information Systems Committee says: "Preservation is therefore a more immediate issue for digital than for traditional resources. Digital resources will not survive or remain accessible by accident: pro-active preservation is needed"

The key technical approaches for keeping digital information alive over time were first outlined in a 1996 report to the US Commission on Preservation and Access (Task Force 1996).

  • Refreshing involves periodically moving a file from one physical storage medium to another to avoid the physical decay or the obsolescence of that medium. Because physical storage devices (even CD-ROMs) decay, and because technological changes make older storage devices (such as 8-inch floppy drives) inaccessible to new computers, some ongoing form of refreshing is likely to be necessary for many years to come.
  • Migration is an approach that involves periodically moving files from one file encoding format to another that is useable in a more modern computing environment. (An example would be moving a Wordstar file to WordPerfect, then to Word 3.0, then to Word 5.0, then to Word 97.) In a photographic environment, we come across older Photoshop files that are no longer readable and have to update them to a newer format. Migration seeks to limit the problem of files encoded in a wide variety of file formats that have existed over time by gradually bringing all former formats into a limited number of contemporary formats.
  • Emulation seeks to solve a similar problem to the one migration addresses, but its approach is to focus on the applications software rather than on the files containing information. Emulation backers want to build software that mimics every type of application that has ever been written for every type of file format, and make it run on whatever the current computing environment is. (So, with the proper emulators, applications like Wordstar and Word 3.0 could effectively run on today's machines.)

Both a migration approach and an emulation approach require refreshing.

This places a burden on individual photographers and small photo libraries, who already have enough to contend with given the rapid changes in their environment. That costly digital files might be unusable in a few years is a worrying thought. While TIFFs and JPEGs - because of their wide acceptance - are likely to be more resistant to becoming obsolete, it will probably still happen eventually. Users of images need to be aware of this and have a plan to refresh the data onto new formats if necessary. This necessitates having good back-up copies to work from.
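In practice, the refreshing step described above is often automated with nothing more than file copies and checksums. The following Python sketch is purely illustrative (the paths and the choice of SHA-256 are assumptions, not something prescribed here): it copies a folder of files onto new storage and verifies each copy against a checksum of the original, so that silent corruption is caught at the moment of transfer.

import hashlib
import shutil
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()

def refresh(source_dir: Path, target_dir: Path) -> None:
    """Copy every file onto new storage and verify each copy bit-for-bit."""
    for source in source_dir.rglob("*"):
        if not source.is_file():
            continue
        target = target_dir / source.relative_to(source_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also preserves timestamps where possible
        if sha256_of(source) != sha256_of(target):
            raise RuntimeError(f"Checksum mismatch after refreshing {source}")

if __name__ == "__main__":
    # Hypothetical paths: an ageing CD-ROM mount and a newly purchased hard drive.
    refresh(Path("/media/old_cdrom"), Path("/media/new_drive/archive_2004"))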

The role of meta-data

It has become clear to many institutions that there should be world-wide standards for data embedded in every file: who created it, when, in what format, captioning and copyright information, for example. This would make access easier and also help preservation in the future. Although various institutions are working towards standards, creating one universal standard will not, of course, be easy. A valuable project is the Dublin Core Metadata Initiative, which runs workshops and projects to create metadata standards.

http://dublincore.org/
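As a rough illustration of what such descriptive metadata might look like, the short Python sketch below writes a minimal Dublin Core record for an image file. The element values and the "sidecar file" approach are invented for the example; a real project would follow the DCMI element set and its own local application profile.

import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

def dublin_core_record(fields: dict) -> bytes:
    """Serialise a flat dict of Dublin Core elements to a small XML document."""
    root = ET.Element("metadata")
    for element, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)

if __name__ == "__main__":
    # Example values only: a 'sidecar' record for a hypothetical scanned negative.
    record = dublin_core_record({
        "title": "Glass plate negative, Antarctic expedition",
        "creator": "Frank Hurley",
        "date": "1913",
        "format": "image/tiff",
        "rights": "Copyright status to be confirmed",
    })
    with open("hurley_negative.tif.dc.xml", "wb") as out:
        out.write(record)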

The decay of physical media

Photographic materials tend to decay slowly over time, so you have enough warning to copy a treasured print, for example. Digital media tend either to function fully or not at all, and you have to open the file to find out which; this adds another layer of uncertainty to the process. While this is the lesser of the two problems, it still has to be thought about. The lifespan of hard drives or CD-RWs is somewhat speculative. In the case of the latter, accelerated lifespan tests have been done, but we still have incomplete data, as the medium is quite new. It would seem wise to back up vital data on two different media, for instance a hard drive and a CD-RW, until more is known.

This is a complex problem. Howard Besser at UCLA seems to be one of the best sources for further information:

Information Longevity http://sunsite.berkeley.edu/Longevity

Besser, Howard. Digital Longevity

Besser, Howard. Longevity of Electronic Art

Task Force on the Archiving of Digital Information

Other sources:

Journal of Electronic Publishing: http://www.press.umich.edu/jep/

Joint Information Systems Committee http://www.jisc.ac.uk.

Sepia (European group investigating preservation of photos) http://www.knaw.nl


PRESERVATION MANAGEMENT OF DIGITAL MATERIALS: A HANDBOOK
by Maggie Jones and Neil Beagrie

Published by THE BRITISH LIBRARY October 2001
Price £15.00 Paperback, 145 pages, 297x210mm, ISBN 0 7123 0886 5

Julian Jackson is an internet consultant and writer specialising in the photographic industry. He publishes two eBooks, Picture Research in a Digital Age 2 and Internet Marketing for Photographers, which are available from his website: http://www.julianjackson.co.uk


Future R&D for Digital Asset Preservation

DPC Forum with Industry
5th June, 2002. Prospect House York Road London SE1 7AW

Since its inception the DPC has aimed to gain industry awareness of its key messages and of the future needs and opportunities that lie ahead. This forum is part of that process. During the day representatives from the private and public sector will be speaking. They will address long-term trends and the research and development issues involved in the implementation of continuing access and preservation strategies by industry and government. Issues covered will include preserving TV and broadcast archives and research and development in the public and private sector.

Meeting Report

Introducing the first Digital Preservation Coalition (DPC) Forum focused on developing a dialogue with industry, DPC Secretary Neil Beagrie welcomed guests and members of the DPC. He then placed the question of digital preservation, and the opportunities for industry participants, firmly within an international context. The US National Digital Information Infrastructure and Preservation Program (NDIIPP), a $175m national programme; the National Science Foundation Cyberinfrastructure initiative; the Information Society Technologies programme of the EU 6th Framework; and, in the UK, the Research Grid and the work of the DPC will be central to bringing political, technical and organisational impetus to bear on the challenges of digital preservation. With the accelerating development of digital content, the issue of maintaining long-term access will be of concern to an increasing range of sectors and individuals.

Forging strategic alliances with industry in the context of preservation was a vital component of any initiative in this area, said Neil Beagrie. He outlined how the DPC's constitution was shaping its emerging links with industry. Two important principles governed the DPC's work: firstly, the DPC will support the development of standards and generic approaches to digital preservation, which can be implemented by a range of hardware, software and service vendors; in short, he continued, the goals of the DPC are vendor-neutral. The second principle is that the DPC is a coalition of not-for-profit organisations including industry associations; it is committed to promoting and disseminating information so that all can learn from the transferable lessons and outcomes. The DPC is actively interested in broadening its industry links including those to individual companies. There is a major potential role for industry, concluded Mr Beagrie; this event therefore marked an important step in realising that potential.

Philip Lord, formerly of GlaxoSmithKline and now a digital archiving consultant, spoke of his own experience in industry and of some of the major drivers for industry in the field of digital preservation. He particularly emphasised the importance of the US Food and Drug Administration (FDA) regulation 21 CFR Part 11 on the maintenance of good electronic records for the pharmaceutical industry. Legal and regulatory issues were, he said, clearly of major importance, as were the various voluntary drivers, such as contractual and IP obligations, operational efficiency considerations and the need to preserve for future reuse. Rarely, he said, were records preserved for historical or sentimental reasons.

Mr Lord talked of the many challenges that faced industry in this area: the heterogeneity of data sources and systems, geographical dispersion (spanning legal and regulatory jurisdictions), the lack of suitable preservation systems and services, management issues, cost, and the lack of expertise in this area. Mr Lord reported that little progress had been made so far within the sector, although a few companies, he reported, were leading the way.

David Ryan from the Public Record Office reported that digital preservation was a core part of the PRO's e-business strategy. Research and development in this area is focusing, he said, on proprietary format migration, emulation and simulation, open format export and migration, and product reviews. The point was made that users need to interact with preserved digital information, rather than simply use it as an historical record (as is more likely the case with printed data), and this gives added importance to the work of the PRO in its e-preservation activities. Mr Ryan concluded by saying that there was an overriding need for the archives community to establish credibility with the key stakeholders in e-preservation, including industry, the media and the general public. The community as a whole needed to prioritise so that the most urgent tasks and challenges were tackled first.

The film and sound archives of the BBC contain some 1.75m film and videotape items and around 800,000 radio recordings from the late 1940s onwards. A ten-year preservation strategy had, Adrian Williams from the BBC reported, recently been approved and would cost around £60m. A key part of this strategy is a programme of digitisation for both access and preservation. He reported on the European Commission Presto project, which involved 10 partners, lasted 24 months and cost around 4.8 million euros. Findings from this project suggested, for example, that digitisation and mass storage are about 50% more expensive but are expected to double the usage of an asset, and moreover that the value of an item must be four times the preservation cost for preservation to be financially viable. He concluded by suggesting that Europe requires a "dedicated preservation factory" given the scale of the task facing national broadcast archives. There was substantial audience interest in the approaches to cost and business modelling described in the Presto project.

Julian Jackson noted that, in the headlong rush to put photographic images into digital form, little thought seemed to have been given to the problem of the longevity of digital files. There is an assumption that they will last, but that is now under question. He addressed general issues surrounding preservation and obsolescence in digital images, surveyed the techniques of refreshing, migration and emulation, and emphasised the crucial role that metadata and metadata standards have to play in these preservation processes.

Paul Wheatley of the CAMiLEON project spoke of some of the practicalities of digital preservation and emphasised the need for long-term strategies; existing methods have many drawbacks. Mr Wheatley described advanced techniques of data migration which can be used to support preservation more accurately and cost-effectively.

To ensure that preserved works can be rendered on computer systems over time, "traditional migration" has been used to convert data into current formats: as the existing format becomes obsolete, another conversion is performed, and so on. Traditional migration has many inherent problems, as errors introduced during one transformation propagate through all future transformations. Mr Wheatley described how the CAMiLEON project had developed new approaches to extending software longevity ("C--") which had been applied in experiments and demonstrated improvements over traditional migration. This new approach is named "Migration on Request".

Migration on Request shifts the burden of preservation onto a single tool, which is maintained over time. Always returning to the original format enables potential errors to be significantly reduced. Mr Wheatley also described how preservation-quality emulators were being produced and how strategies of migration on request and/or emulation were being applied.
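To make the contrast with traditional migration concrete, here is a minimal sketch of the migration-on-request idea (the format name and the converter are hypothetical stand-ins, not the CAMiLEON tools themselves): the archive always keeps the original bytes untouched, and a single, maintained converter renders them into a current format only at the moment of access, so conversion errors cannot accumulate across generations.

from pathlib import Path
from typing import Callable, Dict

# One converter per original format, maintained over time. Each conversion
# always starts from the original bytes, so errors cannot compound.
CONVERTERS: Dict[str, Callable[[bytes], bytes]] = {}

def converter(original_format: str):
    """Register a function that renders an original format into a current one."""
    def register(func: Callable[[bytes], bytes]) -> Callable[[bytes], bytes]:
        CONVERTERS[original_format] = func
        return func
    return register

@converter("wordstar")
def wordstar_to_text(original: bytes) -> bytes:
    # Toy conversion: WordStar set the high bit on some characters; mask it off.
    return bytes(b & 0x7F for b in original)

def render_on_request(archived_file: Path, original_format: str) -> bytes:
    """Return a current-format rendering; the archived original is never altered."""
    original = archived_file.read_bytes()
    if original_format not in CONVERTERS:
        raise ValueError(f"No converter maintained for format: {original_format}")
    return CONVERTERS[original_format](original)

if __name__ == "__main__":
    sample = Path("letter_1985.ws")                   # hypothetical archived original
    sample.write_bytes(b"Dea\xf2 colleague\x8d\x8a")  # fake WordStar-style bytes
    print(render_on_request(sample, "wordstar").decode("ascii"))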

The need for public-private partnerships in the field of digital preservation is crucial, said David Bowen of Audata Ltd. He went on to outline what industry is currently doing in this field - e-mail, document and database migration, as well as promoting standards - while software suppliers are also improving backward compatibility (Word, WordPerfect, RTF) and increasingly adopting and promoting standards themselves. Mr Bowen called for R&D partnerships, like the Testbed Digitale Bewaring in the Netherlands, which are leading to the sharing of results and advice, and to sound record-creation and metadata practices. Particularly important, he concluded, was the need for software suppliers to be brought into the growing public-private partnerships that are developing.

The final session of the day was a discussion session. Key themes that emerged included:

  • the importance of archiving software and technical documentation. It was felt by participants from all sectors that this is a major gap and there was an urgent need to develop appropriate repositories;
  • the need to develop case studies and tools for modelling costs. It was felt this is a major area that should be covered in a future DPC Forum;
  • the necessity of developing national funding for the preservation of intangible heritage assets. It was noted there is no "Superfund" or legislation which allows digital heritage to be gifted in lieu of tax to, or purchased by, the nation;
  • further work by the Digital Preservation Coalition to establish contacts with industry and to build on the dialogue commenced at the forum.

It was felt that it was important, as David Bowen had said, to include software and hardware suppliers in future developments as their actions could be crucial, in particular in providing the tools and products for the end-to-end solutions which were needed in this area. Once again the importance of using both migration and emulation strategies was emphasised, as was the question of the criteria for choosing what should be preserved: we are not in a position to judge easily what will be in demand in the future, so sampling could be of crucial importance as one strand in an overall strategy.

Some delegates from industry felt that there were gaps of responsibility between the organisations, and that it was therefore important for the DPC to coordinate and facilitate activities in this area.

The forum ended on a note of optimism that the first steps in the dialogue with industry had been taken, and with a number of concrete action points which Lynne Brindley, Chair of the DPC, promised would be followed up in the coming months.

End of Meeting Report

Programme and Presentations

10.30 - 11.00  Registration and coffee
   
11.00 - 11.10 Welcome and Introduction
11.10 - 11.35 Keynote Address - "Trends and Future Opportunities" (PDF 92KB) Neil Beagrie JISC
11.35 - 12.00 Preserving digital records in Industry (PDF 258KB) Philip Lord ex GlaxoSmithKline
12.00 - 12.30 Preserving digital records in Government (PDF 71KB) David Ryan Public Records Office
   
12.30 - 1.30 Lunch
   
1.30 - 1.55 Preserving TV and Broadcast Archives (PDF 449KB) Adrian Williams BBC
1.55 - 2.15 Preserving Digital and Historic Images Julian Jackson Internet Consultant and Writer, Picture Research Association; Digital File Longevity (see above)
2.15 - 2.35 The Camileon and Cedars Research projects (PDF 500KB) Paul Wheatley Leeds University
   
2.35 - 3.05 Coffee
   
3.05 - 3.30 Practical Experiences of Preservation: R&D partnerships in the private and public sector (PDF 397KB) David Bowen Audata Ltd
3.30 - 4.10 Discussion
4.10 - 4.30 Concluding Address - Lynne Brindley Chief Executive British Library

Web-archiving: managing and archiving online documents and records

Web sites are an increasingly important part of each institution's digital assets and of this country's information and cultural heritage. This event, organised by the Digital Preservation Coalition (DPC), brought together key organisations in the field of web archiving in order to assess the needs of organisations to archive their own and others' web sites, to highlight good practice, and to influence the wider debate about digital preservation.

Meeting Report

This meeting report provides a short summary of the DPC Members Forum on web archiving held on 25th March 2002. Individual PowerPoint presentations from each of the speakers are available below.

Web sites are an increasingly important part of each institution's digital assets and of this country's information and cultural heritage. As such, the question of their management and archiving is an issue which UK organisations need to be increasingly aware of. This event, organised by the newly-created Digital Preservation Coalition (DPC), brought together key organisations in the field of web archiving in order to assess the needs of organisations to archive their own and others' web sites, to highlight good practice, and to influence the wider debate about digital preservation.

Neil Beagrie, Programme Director for digital preservation at JISC and Secretary of the DPC, began the day's proceedings by welcoming delegates to the event, the first event on web archiving to be organised by the DPC. He stressed the importance of the issue.

The first speaker, Catherine Redfern from the Public Record Office (PRO), provided a short general introduction to web archiving. Web sites are records and, as such, need to be managed and archived. However, selection is also necessary, said Ms Redfern. What criteria should be employed in such a selection process? And how important is capturing the 'experience' of using a web site, given that the look and feel of a site are an intrinsic part of the record? It was important, concluded Ms Redfern, to accept that perfect solutions do not exist and to remain flexible: different solutions may suit different web sites.

Brian Kelly of UKOLN followed, covering the size of the UK web domain and the nature of UK websites. He emphasised the sheer scale of the challenge by looking at definitions and measurements of the UK web space. Different organisations, using different approaches, arrived at different measurements: Netcraft gave a figure of 3 million public web servers containing .uk within their URLs; in 2001, OCLC's Web Characterization Project suggested the UK accounted for 3% of the sites on the WWW; and searches using AltaVista further suggested that UK websites might contain around 25 million pages. Preserving web sites which we are unable to count will prove particularly difficult, he said, but perhaps the most important question was: at what rate is the UK web space growing?

Brian Kelly then went on to describe issues encountered during work on the UK WebWatch project and a pilot study exploring archiving issues for project websites funded under the JISC's eLib programme. He also described the Internet Archive (www.archive.org), which is building historical collections of web content. He concluded that measuring the size of the UK web is difficult, but that the experiences of web-harvesting robot developers and web indexers will provide valuable information for archiving the UK web.
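
The sketch below is not taken from Brian Kelly's presentation or from any DPC material; it is simply a minimal illustration, using only the Python standard library, of the kind of harvesting robot the talk referred to: fetch a seed page, follow links that stay within the .uk domain, and keep a local copy of each page. The seed URL, page limit and selection rule are hypothetical.

```python
# Minimal sketch of a web-harvesting robot (illustrative only, not a DPC tool).
import urllib.parse
import urllib.request
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href attribute of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def harvest(seed_url, max_pages=10):
    """Breadth-first fetch of pages reachable from seed_url, restricted to .uk hosts."""
    queue, seen, archive = [seed_url], set(), {}
    while queue and len(archive) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        host = urllib.parse.urlparse(url).hostname or ""
        if not host.endswith(".uk"):
            continue  # stay within the UK web space
        try:
            with urllib.request.urlopen(url, timeout=10) as response:
                html = response.read().decode("utf-8", errors="replace")
        except (OSError, ValueError):
            continue  # skip unreachable or malformed URLs
        archive[url] = html
        extractor = LinkExtractor()
        extractor.feed(html)
        # Resolve relative links against the current page before queueing them.
        queue.extend(urllib.parse.urljoin(url, link) for link in extractor.links)
    return archive

if __name__ == "__main__":
    pages = harvest("https://www.example.ac.uk/")  # hypothetical seed URL
    print(f"Harvested {len(pages)} pages")
```

A production harvester would also respect robots.txt, throttle its requests and store fetch dates alongside the content, but the shape of the problem is the same.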

Comparisons with the situation in other countries are important in this context, and Julien Masanes from the Bibliothèque nationale de France (BnF) gave the French perspective on these questions. In France the Government is currently in the process of modifying the law regarding legal deposit of online content. Masanes explored the issue of archiving the "deep web" generated by databases, and mechanisms for improving the current generation of web harvesters. The BnF is currently researching the best way to manage procedures of selection, transfer and preservation which could be applied on a large scale within the framework of the proposed new law. Two large-scale projects are proposed as part of this ongoing research. The first has begun and involves sites related to the presidential and parliamentary elections taking place in France in Spring 2002; more than 300 sites have already been selected and the BnF is collecting about 30 GB per week. The second project will be a global harvest of the '.fr' domain in June.

If the sheer scale of the material to be archived presents a major challenge, it is one that the BBC, with a million pages on its web site, each regularly updated, faces as a matter of course. Cathy Smith, New Media Archivist at the BBC, spoke about modernising the BBC archive to include its digital content and the huge logistical and legal problems this can involve. The BBC's Charter responsibilities mean that it must archive its content, while its message boards, live chat forums and similar services mean that Data Protection becomes a serious issue in this context too. Multimedia content, often created through non-standard production processes, adds further problems, while proposals to extend the period within which the public can make formal complaints from one year to three years have important consequences for the amount that will need to be archived. Ms Smith talked of the need to change perceptions from archiving to media management, and for more pre-production emphasis on generating metadata and considering future re-use. She also emphasised that the archive needs to recreate the look and feel of the original record, since this is an important aspect of what the BBC does.
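
Ms Smith's point about generating metadata at the pre-production stage can be illustrated with a short sketch. This is not a BBC or DPC procedure; it is one plausible way, in Python, of writing a descriptive and fixity sidecar file next to a page when it is published, so that later re-use and audit are easier. The field names are invented for the example.

```python
# Illustrative sidecar generator; the metadata fields are hypothetical, not a BBC schema.
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def write_sidecar(page_path: Path, title: str, rights: str) -> Path:
    """Record basic descriptive and fixity metadata alongside a published HTML page."""
    content = page_path.read_bytes()
    metadata = {
        "title": title,
        "source_file": page_path.name,
        "published": datetime.now(timezone.utc).isoformat(),
        "rights": rights,
        "sha256": hashlib.sha256(content).hexdigest(),  # fixity value for later audits
    }
    sidecar = page_path.parent / (page_path.name + ".metadata.json")
    sidecar.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return sidecar

# Example use (the path and values are placeholders):
# write_sidecar(Path("news/story123.html"), "Election night coverage", "(c) BBC 2002")
```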

A number of short reports from DPC members followed in the afternoon, focussing on current initiatives and pilot projects. Stephen Bury of the British Library spoke of the BL Domain UK pilot project, the capture of 100 websites, and some of the criteria used by the BL in its current archiving activities, given the lack of a legal deposit requirement. These criteria include topicality and reflecting a representative cross-section of subject areas. Metrics for the sites captured were provided; for example, only 10% were "Bobby" compliant (an accessibility benchmark). Future developments would include scaling up the project and national and international collaboration.

Stephen Bailey, Electronic Records Manager for the Joint Information Systems Committee (JISC), spoke of the JISC's efforts to implement its own recommendations on electronic records management and its current project to redesign its own web site. The archive module of the new website will allow for the identification and retention of key pages and documents and will also give end users a greater degree of functionality. Centralised control of the web records' lifecycle will allow for greater uniformity but will place demands on content providers. Future developments will include working in partnership with the national archives and libraries on long-term preservation, and looking at the preservation of the distributed JISC-funded project websites.

Steve Bordwell of the National Archives of Scotland asked whether we should even be attempting to preserve web sites in terms of look and feel, or whether we should instead be focussing on their content. He discussed the Archives' first work in the field: archiving snapshots of the Scottish Parliament website and the "Suckler Cow Premium Scheme", a website based on an Oracle database with Active Server Pages. They cannot preserve the whole application for the Suckler Cow site, but will capture and preserve the dataset and use screencams to preserve the look and feel.
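
The "capture and preserve the dataset" approach can be sketched as follows. The real Suckler Cow system used an Oracle database behind Active Server Pages; the example below uses Python's built-in sqlite3 module purely so that it is self-contained, and the database and output names are placeholders. The idea is the same: export every table, with its column headings, into plain CSV files that can be preserved independently of the application.

```python
# Illustrative dataset export for preservation; sqlite3 stands in for the real database.
import csv
import sqlite3

def export_tables_to_csv(db_path, output_dir="."):
    """Dump every user table in the database to a separate CSV file, headers included."""
    exported = []
    conn = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        for table in tables:
            cursor = conn.execute(f'SELECT * FROM "{table}"')
            columns = [col[0] for col in cursor.description]
            out_path = f"{output_dir}/{table}.csv"
            with open(out_path, "w", newline="", encoding="utf-8") as handle:
                writer = csv.writer(handle)
                writer.writerow(columns)              # keep the column names with the data
                writer.writerows(cursor.fetchall())
            exported.append(out_path)
    finally:
        conn.close()
    return exported

# Example use (file names are placeholders):
# export_tables_to_csv("suckler_cow.db", "archive_2002")
```

Alongside such an export, the screencam recordings mentioned above would document how the application presented and queried that data.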

David Ryan of the PRO looked at the project to preserve the No. 10 web site as it stood on election day, June 2001, and asked what an acceptable level of capture and functionality might be in terms of archiving and preservation procedures. Kevin Ashley of the University of London Computer Centre (ULCC) suggested that we need to think about precisely what the purpose of a website is, and what its significant properties are, in order to formulate criteria for selection, capture and preservation.

Robert Kiley spoke about the joint Wellcome Trust/JISC web-archiving feasibility study and the specific strand that is looking at archiving the medical Web. He emphasised what we are in danger of losing if no action is taken. Once again, the sheer volume of the medical Web presents significant problems for selection: quality would be one criterion, but how should we judge quality? In addition, many databases are published only electronically, while discussion lists and e-mail correspondence are also potentially of immense importance to future generations of researchers. Are the next generation of Watson and Crick already communicating electronically via a public email forum, and will this survive? He outlined key issues to be addressed by the consultancy, including copyright, costs and the maintenance of any medical web archive.

The concluding discussion on the way forward for the UK emphasised the value of sharing current approaches to, and technical developments in, web archiving, both internationally and within the UK. There are still many technical challenges, including the preservation of database-driven sites and the need for better tools for harvesting and archiving webpages. It was recognised that the scale of the task in the UK was significant and would require careful selection of sites, as well as collaboration between organisations, to address it effectively. The DPC would be setting up further individual meetings between members to advance discussions initiated at the forum and to develop plans for scaling up current pilot activities.

End of Meeting Report

Presentations

Session 1

Web-archiving: an introduction to the issues (PDF 17KB) Catherine Redfern PRO (based on MA research)

Developing a French web archive (PDF 115KB) Julien Masanes Bibliothèque nationale de France

The UK domain and UK websites (PDF 292KB) Brian Kelly UKOLN

Archiving the BBC website (PDF 15KB) Cathy Smith BBC

Session 2

Members' contributions

Stephen Bury (British Library) (PDF 17KB)

Steve Bailey (Joint Information Systems Committee) (PDF 16KB)

David Ryan (Public Record Office) (PDF 201KB)

Kevin Ashley (University of London Computer Centre) (PDF 10KB)

Robert Kiley (Wellcome Trust Library) (PDF 127KB)

Steve Bordwell (National Archives of Scotland) (PDF 45KB)

Read More

Digital Preservation Launch at House of Commons

Added on 27 February 2002

Press Release Number Two - 27th February 2002

Coalition launches at House of Commons to secure the future of digital material

27th February 2002. Embargoed until 8pm, 27.02.02.

The Digital Preservation Coalition (DPC) announced an action plan to ensure that the digital information we are producing is not lost to current and future generations. The key messages were:

Read More
