Digital Longevity: the lifespan of digital files (compiled for R&D in Digital Asset Preservation)


By Julian Jackson

We know that photographic negatives, transparencies and prints last a long time. They are reliable forms of storing data. Recently the Royal Geographical Society reprinted Frank Hurley's pictures from the 1913 Antarctic Expedition from his original glass negatives, nearly 100 years old. This shows how robust the storage medium is: remember, these negatives had been in sub-zero conditions and transported across an ocean in a tiny lifeboat!

In the headlong rush to put photographic images into digital form, little thought has been given to the longevity of digital files. The assumption is that they will last, but that assumption is open to question.

"There is growing realisation that this investment and future access to digital resources, are threatened by technology obsolescence and to a lesser degree by the fragility of digital media. The rate of change in computing technologies is such that information can be rendered inaccessible within a decade. Preservation is therefore a more immediate issue for digital than for traditional resources. Digital resources will not survive or remain accessible by accident: pro-active preservation is needed." Joint Information Systems Committee: Why Digital Preservation?

The 1086 Domesday Book, instigated by William the Conqueror, is still intact and available to be read by qualified researchers in the Public Record Office. In 1986 the BBC created a new Domesday Book about the state of the nation, costing £2.5 million. It is now unreadable. It contained 25,000 maps, 50,000 pictures, 60 minutes of footage, and millions of words, but it was made on special disks which could only be read in the BBC micro computer. There are only a few of these left in existence, and most of them don't work. This Domesday Book Mark 2 lasted less than 16 years.

Digital files have to be stored, and the physical media they are stored on, for instance a computer's hard disk drive or a CD-ROM, have finite lifespans. But the primary problem is obsolescence. Computer formats sink into oblivion very rapidly. Howard Besser, of the UCLA School of Education & Information Studies, says: "Fifteen years ago Wordstar had (by far) the largest market penetration of any word processing program. But few people today can read any of the many millions of Wordstar files, even when those have been transferred onto contemporary computer hard disks. Even today's popular word processing applications (such as Microsoft Word) typically cannot view files created any further back than two previous versions of the same application (and sometimes these still lose important formatting). Image and multimedia formats, lacking an underlying basis of ascii text, pose much greater obsolescence problems, as each format chooses to code image, sound, or control (synching) representation in a different way."

If an image has been generated on negative or transparency, then scanned and transformed into a digital file, the original is safe. However, if it has been digitally originated, as much of today's news and sport photography is, then vital parts of our cultural heritage may be lost forever. This problem will get worse as more photography becomes completely digital.

The two aspects of the problem

The longevity problem can be divided into two questions: the lifespan of the medium on which the file is stored, e.g. a CD-ROM, and the obsolescence of the format. Digital formats age quite rapidly because they are superseded by new formats, particularly if they are proprietary ones.

As the British Joint Information Systems Committee says: "Preservation is therefore a more immediate issue for digital than for traditional resources. Digital resources will not survive or remain accessible by accident: pro-active preservation is needed."

The key technical approaches for keeping digital information alive over time were first outlined in a 1996 report to the US Commission on Preservation and Access (Task Force 1996).

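These approaches are refreshing (periodically copying files from one physical storage medium to another), migration (periodically converting files from one format to a newer one) and emulation (recreating the original software environment so that old files can still be opened). Both a migration approach and an emulation approach require refreshing.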

This places a burden on individual photographers and small photolibraries, who already have enough to contend with in a rapidly changing environment. That costly digital files might be unusable in a few years is a worrying thought. While TIFFs and JPEGs - because of their wide acceptance - are likely to resist obsolescence for longer, they will probably become obsolete eventually. Users of images need to be aware of this and have a plan to move the data into new formats if necessary. This requires good back-up copies to work from.
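A plan of this kind starts with knowing which formats an archive actually contains. The following is a minimal sketch in Python, not a definitive tool: it identifies JPEG and TIFF files by the signature bytes at the start of each file rather than by file extension, and counts everything else as unknown. The function names and the short signature table are illustrative assumptions; a real archive would need a much longer list of formats.

```python
from collections import Counter
from pathlib import Path

# Leading "magic" bytes for the two formats named in the article.
# Assumption: only these are checked; anything else is "unknown".
SIGNATURES = [
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF"),   # little-endian TIFF
    (b"MM\x00*", "TIFF"),   # big-endian TIFF
]


def identify(path):
    """Return a format name based on the first bytes of the file."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    for magic, name in SIGNATURES:
        if head.startswith(magic):
            return name
    return "unknown"


def inventory(folder):
    """Count the files under a folder by detected format."""
    files = (p for p in Path(folder).rglob("*") if p.is_file())
    return Counter(identify(p) for p in files)
```

Calling inventory() on an archive folder returns a count per format, which gives a rough picture of how much material would need converting if one of those formats fell out of use.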

The role of metadata

It has become clear to many institutions that there should be world-wide standards for the data embedded in every file: who created it, when, in what format, and with what captioning and copyright information, for example. This would make access easier and also help preservation in the future. Although various institutions are fumbling towards standards, creating a single universal one will not be easy. A valuable project is the Dublin Core Metadata Initiative, which runs workshops and projects to create various metadata standards.

http://dublincore.org/
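As a rough illustration of what such a record might look like, the sketch below uses Python's standard library to write the fields mentioned above as Dublin Core elements (creator, date, format, description, rights). The element names and namespace come from the Dublin Core element set; the wrapper element, the function name and all the sample values are hypothetical. Note that this produces a separate record to keep alongside the image, not data embedded in the image file itself.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_record(fields):
    """Return an XML string with one Dublin Core element per field."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # wrapper name is an assumption
    for name, value in fields.items():
        element = ET.SubElement(root, "{%s}%s" % (DC_NS, name))
        element.text = value
    return ET.tostring(root, encoding="unicode")


# All values below are made up for illustration.
record = build_record({
    "creator": "A. Photographer",
    "date": "2002-06-01",
    "format": "image/tiff",
    "description": "Crowd outside the stadium before the final",
    "rights": "Copyright A. Photographer, all rights reserved",
})
print(record)
```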

The decay of physical media

Photographic materials tend to decay slowly over time, so you have enough warning to copy a treasured print, for example. Digital media tend either to function fully or not at all, and you have to open the file to find out which. This adds another layer of uncertainty to the process. While this is the lesser of the two problems, it still has to be thought about. The lifespan of hard drives and CD-RWs is somewhat speculative. In the case of the latter, accelerated lifespan tests have been done, but the data are still incomplete, as the medium is quite new. Until more is known, it would seem wise to back up vital data on two different media, for instance a hard drive and a CD-RW.
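One way to avoid opening every file by hand is to compare checksums between the two copies. The sketch below, in Python, is only an outline under stated assumptions: it computes a SHA-256 checksum for each file in a primary folder and a backup folder and reports files that are missing from the backup or whose contents differ. The function names are hypothetical, and a mismatch says only that the copies differ, not which one is damaged.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its checksum."""
    folder = Path(folder)
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def compare_copies(primary, backup):
    """Return (missing, changed) file names for the backup copy."""
    first = build_manifest(primary)
    second = build_manifest(backup)
    missing = sorted(set(first) - set(second))
    changed = sorted(
        name for name in first
        if name in second and first[name] != second[name]
    )
    return missing, changed
```

Running compare_copies() at regular intervals, for example against a hard drive folder and a mounted disc, would flag a failing copy while the other copy is still good enough to restore from.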

This is a complex problem. Howard Besser at UCLA seems to be one of the best sources for further information:

Information Longevity http://sunsite.berkeley.edu/Longevity

Besser, Howard. Digital longevity,

Besser, Howard. Longevity of Electronic Art,

Task Force on the Archiving of Digital Information

Other sources:

Journal of Electronic Publishing: http://www.press.umich.edu/jep/

Joint Information Systems Committee http://www.jisc.ac.uk.

Sepia (European group investigating preservation of photos) http://www.knaw.nl