Endangered

Documents in Portable Document Format (PDF; ISO 32000-1 and ISO 32000-2), and other data wrapped inside them, in all variants and versions other than PDF/A.

Group: Formats

Trend in 2021: Trend towards reduced risk

Consensus Decision

Added to List: 2017

Previous classification: Endangered

Imminence of Action

Action is recommended within five years, detailed assessment within three years.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve

It would require a small effort to address losses in this group, requiring the application of proven preservation tools or techniques.

Examples

PDF 1.1, 1.2, 1.3, 1.4 (excluding PDF/A as a subset), 1.5, 1.6, 1.7 and 2.0; PDF/X and PDF/E.
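As a rough illustration of how the versions listed above might be told apart in practice, the following Python sketch reads the `%PDF-M.N` comment at the start of a file. The function name `sniff_pdf_version` and the 1024-byte search window are assumptions made for this example, not part of any standard tool. The header is only a first indication: from PDF 1.4 onwards a document can declare a later version inside the file, and subsets such as PDF/A, PDF/X and PDF/E are declared in metadata rather than in the header, so a dedicated identification or validation tool is still needed for a reliable answer.

```python
import re
from typing import Optional

# Assumption for this sketch: the "%PDF-M.N" header comment appears
# somewhere in the first 1024 bytes of the file.
_HEADER = re.compile(rb"%PDF-(\d)\.(\d)")


def sniff_pdf_version(data: bytes) -> Optional[str]:
    """Return the header version (e.g. '1.4'), or None if no header is found."""
    match = _HEADER.search(data[:1024])
    if match is None:
        return None
    major = match.group(1).decode("ascii")
    minor = match.group(2).decode("ascii")
    return f"{major}.{minor}"


if __name__ == "__main__":
    import sys

    # Usage: python sniff.py file1.pdf file2.pdf ...
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, sniff_pdf_version(handle.read(1024)))
```

A result of None suggests the file is not a PDF at all, or that its header is damaged, and either case would merit closer inspection.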

‘Critically Endangered’ in the Presence of Aggravating Conditions

Loss of context; loss of authenticity or integrity; external dependencies; lack of understanding; significant diversity of data; poorly developed digitization specifications; lack of integrity checking; poorly developed migration or normalization specifications; lack of virus control; poor storage or replication; lack of validation at the point of creation; encryption.
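Two of the aggravating conditions above, lack of integrity checking and encryption, lend themselves to a simple first-pass check. The Python sketch below is illustrative only: the function name `triage_pdf_bytes` and the keys of the returned dictionary are invented for this example. It records a SHA-256 checksum that can be compared over time, and looks for the `/Encrypt` key that an encrypted PDF carries in its trailer. The byte search is a heuristic, since the same bytes could occur elsewhere in a file, so a positive result is a prompt for closer inspection with a proper PDF tool rather than a diagnosis.

```python
import hashlib


def triage_pdf_bytes(data: bytes) -> dict:
    """Return a fixity checksum and a crude encryption flag for one PDF."""
    return {
        # Checksum to store now and re-compute later to detect silent change.
        "sha256": hashlib.sha256(data).hexdigest(),
        # Heuristic only: an encrypted PDF references an /Encrypt dictionary,
        # but a plain byte search may also match unrelated content.
        "maybe_encrypted": b"/Encrypt" in data,
    }


if __name__ == "__main__":
    import sys

    # Usage: python triage.py file1.pdf file2.pdf ...
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, triage_pdf_bytes(handle.read()))
```

Storing the checksum alongside the file at ingest, and re-running the check on a schedule, is one common way to turn this into routine integrity checking.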

‘Vulnerable’ in the Presence of Good Practice

Well managed data infrastructure; preservation planning; authenticity managed; use of persistent identifiers; reduction of dependencies; application of records management standards; recognition of preservation requirements beyond formats; strategic investment in digital preservation; preservation roadmap; participation in digital preservation community; format validation.

2021 Review

The 2019 Jury introduced this entry as a subset of a previous entry for ‘PDF’, emphasizing the different threats faced by different types of PDF. PDF/A explicitly reduces dependencies and thus curtails preservation risks for certain types of content; PDFs of other types do not. PDF and PDF/A have sometimes been misunderstood as a generic solution to digital preservation requirements. In the eyes of the Jury, they can only offer a preservation solution when embedded within a wider preservation infrastructure. The 2021 review agreed, noting a 2021 trend towards reduced risk: PDF continues to be a stable format, and tools and techniques continue to develop (e.g. the ability to convert PDF to PDF/A to reduce dependencies). The Endangered classification is unchanged, however, given the need for support and embedding in a preservation infrastructure.

Additional Jury Comments

There is a lot of material produced and kept in PDF. Some of it is authoritative, in other words the only available copy, while some of it is not. If it is the only copy and it is lost, the loss can have an impact on a lot of people.

The challenge in evaluating the significance and impact of the loss of PDFs is that they are quite often a surrogate of something else, whether a digitized record, a Word document or similar. Whether or not that original record is retained may be a factor. We should also be considering PDF Portfolios, which are an extension of PDF 1.7. Portfolios contain embedded files and can include text documents, spreadsheets, PowerPoints, emails and Computer Aided Design (CAD) drawings. Assessing the risk of this complex format may need to be separate from other PDFs.

See also: Fanning, B (2017) Preserving with PDF/A (Second Edition), DPC Technology Watch Report 17-01 online at http://doi.org/10.7207/twr17-01.

