Maritime Archaeological Archives

Critically Endangered

This is a collection of digital records from maritime archaeological work, including photographs, maps and plans, field notebooks, post-excavation finds analysis and other analytical records.

Group:  Museum Data

Trend: New Entry

Consensus Decision

Added to List: New Entry

Last update: New Entry

Previous category: New Entry

Imminence of Action

Action is recommended within 12 months; detailed assessment is now a priority.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve

It would require a major effort to prevent losses in this group, such as the development of new preservation tools or techniques.

Examples

Records of excavations in marine environments which may fall outside the jurisdiction of terrestrial heritage services.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Poor documentation; lack of preservation mandate; dependence on proprietary and non-standard data types

‘Endangered’ in the Presence of Good Practice

Preservation planning from the outset; subject specialist repository; user community

2019 Review

This is a new entry taken from the open submission process in 2019.  It is grouped with Museum data sets as archaeological archives typically make their way to museums, but it is also closely aligned to research data.

Additional Jury Comments

There are trusted custodians of this data, such as ADS, DANS or the British Museum, as well as in oceanographic research agencies, but it is perhaps hard to integrate good practice at an international scale. The real challenge, therefore, is in identifying and sustaining a custodian, given that other bodies already have experience with this data. The proliferation of innovative data recording technologies also implies likely problems of format dependence and documentation.

Data Posted to Defunct or Little-used Social Media Platforms

Critically Endangered

Older or less widely used social media platforms to which content has been uploaded but for which no guarantees have been made about the long term.

Digital Species: Social Media

Trend in 2022: No Change

Consensus Decision

Added to List: 2019

Trend in 2023: No Change

Previously: Critically Endangered

Imminence of Action

Immediate action necessary. Where detected should be stabilized and reported as a matter of urgency.

Significance of Loss

The loss of tools, data or services within this group would impact on many people and sectors.

Effort to Preserve | Inevitability

Loss seems inevitable: loss has already occurred or is expected to occur before tools or techniques develop.

Examples

BeBo, MySpace, Google Buzz.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Closure of platform; lack of offline equivalent; lack of export functionality; no preservation undertaking from service provider; unstable business plan from service provider.

‘Endangered’ in the Presence of Good Practice

Offline Replication; clear notice periods and alerts; committed ongoing maintenance of service.

2023 Review

The 2019 Jury revived this entry from an initial submission in 2017 that they were not able to assess at the time. It was added to the Bit List following the 2019 assessment to emphasize the different threats faced by social media users who uploaded content to defunct or little-used social media platforms. Because these services are older, the need to act is more urgent than for others. Often, the significance of the content is only brought to attention once it is lost. The 2021 Jury noted a trend towards greater risk due to the existing risks of defunct or little-used platforms, together with recognition of the need to develop tools or techniques that could be applied to other platforms that may follow the same path. The 2022 Taskforce agreed these risks remain on the same basis as before (‘no change’ to trend).

The 2023 Council agreed with the Critically Endangered classification and noted an increase in imminence and effort to preserve, recognizing that while the need for major efforts to prevent or reduce losses continues on the same basis as before, it is now much more likely that loss of material has already occurred, and will continue to do so, by the time tools or techniques have been developed. Therefore, immediate action is necessary.

Additional Comments

The 2023 Bit List Council additionally recommends that the next major review for the Bit List includes a consideration of merging this entry with the ‘Consumer Social Media Free at the Point of Use’ entry to provide examples of loss prompted by aggravating conditions.

The risk to this content depends on the specific service or platform. Older platforms (BeBo, MySpace) pose a higher risk of loss than current platforms, and much of their content is likely already lost. However, social media wasn’t used to the same extent (and not as widely by government, corporations, research institutions, etc.) in the early 2000s/2010s when these platforms were popular, which reduces the impact slightly.

When looking at the digital preservation landscape and where we need to apply effort as well as resources, defunct early social media spaces are not high on the list; but, when considering how contemporary social media channels could become defunct, it becomes a different conversation because of how intrinsically tied they are to political discourse and influencing political opinion.

It is to be hoped that some of these have been archived via traditional web archiving, so that remnants of these sites can be found in bits and pieces in various web archives, but it may be too late to save some of the content. If some of it is still available, there may be hope of preserving it, but this may be difficult if the platforms are not willing to share data or work with preservationists. ArchiveTeam has stepped in here too. There is undoubtedly a story here that could be used as a call to arms to raise awareness about the preservation of current social media platforms too.

Case Studies or Examples:

Digital Archives of Community Groups

Critically Endangered

Digital materials including ephemera, correspondence and campaign materials created as a by-product of small scale or ad-hoc community action groups.

Digital Species: Community Archives

Trend in 2022: No Change

Consensus Decision

Added to List: 2019

Trend in 2023: No Change

Previously: Critically Endangered

Imminence of Action

Action is recommended within three years, detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques.

Examples

Archives of smaller and ad-hoc political and campaigning organizations; environmental protests; sports clubs; smaller religious groups; amateur music or drama; fan groups.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Poor documentation; lack of replication; lack of continuity funding; lack of residual mechanism; dependence on small number of volunteers; lack of preservation mandate; lack of preservation thinking at the outset; conflation of backup with preservation; conflation of access and preservation; inaccessible to web archiving; dependence on social media providers; distrust of ‘official’ agencies.

‘Endangered’ in the Presence of Good Practice

Residual archive with residual funding able to receive and support collections; active user community; intellectual property managed to enable preservation.

2023 Review

The Jury created this entry in 2019 as a subset of ‘Community Archives and Community-Generated Content’, which was split into two to provide greater specificity in recommendations for approaching the preservation of digital materials created as a by-product of small-scale or ad-hoc community action groups (versus digital materials generated for a significant purpose of a community initiative).

There was a 2020 trend towards greater risk based on community groups such as sports clubs, religious communities, arts and political groups, often relying on volunteer effort, being unable to meet for extended periods in 2020. Moreover, the local community centres, clubs or places of worship on which they depend have closed, in some cases for good.

This trend continued for 2021. The Jury commented that much of the content in community archives is easily preservable, but resources are not directed towards it; that basic digital preservation practices are not well embedded amongst the general population; and that selective approaches are needed to get a handle on the situation and to find the resources to do the work.

The 2023 Council agreed with the classification of Critically Endangered and discussed an increase in the significance of loss due to the fact that community heritage tends to be part of wider conversations within the international landscape.

Additional Comments

The 2023 Council additionally noted that the entry currently contains a broad spectrum of very diverse types of materials each with different preservation considerations. For this reason, they recommend that the next major review for the Bit List includes a rescoping or splitting of this entry to allow for a deeper discussion of the preservation issues that exist within this spectrum.

Typically, born-digital material is more at risk, as community groups may not know about the risk of loss. Many are unaware of digital preservation terminology. It is the ad-hoc nature of these groups and projects which is of great concern.

There is a significant need to raise awareness and provide a ‘home’ but also to do so with sufficient sensitivity so as to ensure community groups remain in control of their own material.

Communities who live in rural and remote areas may have a lack of access to services such as broadband connectivity, which is a well-reported issue and is often referred to as the “digital divide”. Inadequate internet connectivity would diminish the capacity for these communities to access digital preservation solutions, such as cloud storage for digital assets. This is especially prevalent with personal photos and videos on mobile phones as possession of a mobile phone does not necessarily mean the user has adequate internet connectivity to be able to upload videos to web-based platforms.

AI could potentially be used to assist with easy access to simple, succinct explanations and principles of digital preservation and archiving solutions which would give these communities a wider understanding of the work being done and empower them to be able to do minimum digital preservation themselves.

See also:

Digital Evidence and Records of Investigation Prior to Court

Critically Endangered

Digital materials assessed by police and other authorities in the course of investigation and retained as evidence of due process such as case files and correspondence, including materials not submitted to court.

Digital Species: Legal Data

Trend in 2022: No Change

Consensus Decision

Added to List: 2019

Trend in 2023: No Change

Previously: Critically Endangered

Imminence of Action

Action is recommended within twelve months, detailed assessment is a priority.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

It would require a major effort to prevent losses in this group, such as the development of new preservation tools or techniques.

Examples

CCTV; email; 3D scanning; social media interactions; police records; court records; text messages.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Poor chain of custody; fragile or obsolete media; dependence on proprietary formats or products; lack or loss of documentation; inaccessible to web harvesting technologies; lack of version control; lack of integrity checks or integrity records.

‘Endangered’ in the Presence of Good Practice

Meticulous transfer and disclosure processes.

2023 Review

This entry was added in 2019 as a subset of an entry made in 2017 for ‘Digital Legal Records and Evidence,’ which the Jury split into four more discrete entries. This category includes material gathered prior to court that may form part of an investigation or gathering of evidence but which is not formally submitted as evidence. It recognizes that police and other investigating authorities are not limited in the types of evidence that they need to administer, but that this creates an almost unbounded set of preservation requirements to ensure authenticity and admissibility. A 2021 risk was identified based on examples that call into question whether legal bodies have the skills and capabilities to preserve these materials should they be needed later, for example if a case is reopened. The 2022 Taskforce found no significant trend towards greater or reduced risk.

The 2023 Council agreed with the Critically Endangered classification with the overall risks remaining on the same basis as before (no change to the trend).

Additional Comments

In the realm of international organizations, more and more of these investigative missions are being set up. They are collecting huge volumes of data, and the same issues around chain of custody and integrity records or checks continue to be aggravating, especially with respect to authenticity and admissibility. Given the potentially huge volumes of data, and the drive to keep costs low, it is debatable whether there will be sustained funding for preservation.

Case files and correspondence are one thing: retention of these should be clear but may differ widely between jurisdictions and levels of government. If retention is not long-term or permanent, the risk of loss may not be so critical. Retention of 'unused' or 'potential' evidence is likely a different matter altogether. Is it even a record? Certainly, it is not a record of the court. Should it be returned to the suspect or accused? Are their rights being considered here - not just in terms of preservation, but also simply disposition? There may be legal and ethical issues around this that need to be fleshed out in conjunction with assessing its preservation risk.

I was talking about forensic techniques with some law enforcement types a while back. Police forces tend only to have the resources to maintain forensic capability with relatively recent technology - for older technology, institutions and specialist companies are the only sources of expertise. This has an impact on cold cases.

There have been many examples of convictions being overturned when previously unused evidence was brought to light. Therefore, the retention and preservation of unused evidence can have immense value.

Evidence in Court

Critically Endangered

Digital materials presented in court as evidence, or documents such as rulings and proceedings generated through legal proceedings.

Digital Species: Legal Data

Trend in 2022: No Change

Consensus Decision

Added to List: 2017

Trend in 2023: No Change

Previously: Critically Endangered

Imminence of Action

Action is recommended within three years, detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

It would require a major effort to address losses in this group, possibly requiring the development of new preservation tools or techniques.

Examples

Evidence submitted to courts of all kinds, including text messages, photography, CCTV, email, 3D and 2D scanning, scientific reports and analyses, documents and websites.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Loss of context; loss of integrity; external dependencies; poor storage; lack of understanding; churn of staff; significant volume or diversity of data; poorly developed specifications; ill-informed records management; poorly developed transfer protocols; poorly developed migration or normalization; longstanding protocols or procedures that apply unsuitable paper processes to digital materials.

‘Endangered’ in the Presence of Good Practice

Well managed data infrastructure; preservation enabled at ingest; carefully managed authenticity; use of persistent identifiers; finding aids; well managed records management processes; recognition of preservation requirements at highest levels; strategic investment in digital preservation; preservation roadmap; participation in digital preservation community.

2023 Review

This entry is a subset of an entry made in 2019 titled ‘Proceedings and Evidence in Court,’ which was itself created as a subset of an entry in 2017 for ‘Digital Legal Records and Evidence’. The 2021 Jury split ‘Proceedings and Evidence in Court’ into two more discrete entries to highlight their distinct preservation challenges and risk profiles. This entry covers materials that have been presented as evidence in court. It was given a Critically Endangered classification to highlight its higher risk profile and additionally emphasize that courts are not limited in the types of evidence that they can admit but that they have a responsibility to provide robust preservation that ensures the authenticity of their records and evidence. The 2022 Taskforce found no significant trend towards greater or reduced risk.

The 2023 Council agreed with the Critically Endangered classification with the overall risks remaining on the same basis as before (no change to the trend). They emphasized that organizations holding these materials should have identified preservation actions established in their workplan (for digital evidence of investigation prior to court) to put into practice within the next three years.

Additional Comments

Temporary courts are continuing to gradually close, and decisions about preservation and management of their archives are being made hurriedly and at the last minute. Some of the decisions are placing materials at high risk due to: materials being split across many locations, including to entities with no capacity or capability to preserve them; a seeming lack of understanding that preservation and management of the archives has no completion date; an unwillingness to invest in preservation, or a drive to keep costs low, which is resulting in negative implications for preservation; and hurried choices on preservation measures which do not allow for proper testing of approaches to safeguard authenticity and legal admissibility (e.g. extracting digital data from complex systems in formats that can then potentially not be restored).

Standard Records Management processes within designated agencies should be able to take care of the preservation of materials like this but given that it is likely to involve complex types of data, such agencies may not be equipped to deliver preservation effectively. It is surprising that courts are not more prominent in the digital preservation community, where solutions now exist.

Case Studies or Examples:

  • For example, the Special Tribunal for Lebanon 14th Annual Report (2022-2023) touches on the above comments concerning the planning and approaches developed and agreed between the United Nations and the Government of Lebanon to guide the Special Tribunal in completing its residual functions, including the management and preservation of the records and archives of the Special Tribunal. See Special Tribunal for Lebanon (2023) ‘Special Tribunal for Lebanon 14th Annual Report (2022 - 2023)’. Available at: https://www.stl-tsl.org/sites/default/files/documents/annual-reports/STL_Annual_Report_2022-2023.pdf [accessed 24 October 2023].

More concrete examples would be welcome. It is the evidentiary value of submissions to court that may be lost, and therefore the veracity of the decision could be questioned. Evidence submitted in digital form is at greater risk (e.g., a video file submitted on a CD in the 90s) than records of the proceedings themselves (e.g., transcripts).

Legacy Research Web Collections

Critically Endangered

Research related collections of digital content on the web which are now outdated and/or no longer actively maintained. This can include software and published or unpublished source code.

Digital Species: Web, Research Outputs

Trend in 2022: No Change

Consensus Decision

Added to List: 2019

Trend in 2023: No Change

Previously: Critically Endangered

Imminence of Action

Action is recommended within twelve months, detailed assessment is a priority.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

Loss seems likely: by the time tools or techniques have been developed the material will likely have been lost.

Examples

Academic and institutional websites from the first decade of the web containing details of research projects and interests as well as research data.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Inaccessible to web archive; bespoke code; insufficient documentation; uncertain intellectual property right.

‘Endangered’ in the Presence of Good Practice

Secured by web archive; documentation and rights information published alongside.

2023 Review

This entry was added in 2019. There are overlaps with the ‘Semi-Published Research Data’ entry, and also with ‘Unpublished Research Data,’ but it is a separate entry to distinguish between ‘current’ and ‘legacy’ collections with different risk profiles: in this case, the fact that materials in legacy web collections are no longer actively maintained raised the classification to Critically Endangered, compared with Endangered for Semi-Published Research Data. The 2021 Jury agreed with these distinctions, adding that loss has already occurred and that future loss can be prevented through approaches such as web archiving and code preservation; however, risks had become notably greater over the preceding years due to security issues posed by hosting legacy technology, software and services, which prompted imminent disposal of content without adequate review or selection. Therefore, there was a 2021 trend towards increased risk to reflect this. The 2022 Taskforce agreed with this assessment, noting no change to the trend (it remained on the same basis as reported in 2021).

The 2023 Council agreed with the Critically Endangered classification and noted greater inevitability of loss compared to previous reviews. Additionally, the Council recommended that a nomination for a Bit List entry on unpublished digital indices and transcriptions in the DIMEV Open-Access Digital Edition of the Index of Middle English Verse would serve better as a valuable example within this entry than as a new, standalone entry.

Additional Comments

The 2023 Council additionally recommended that the next major review considers rescoping the entry, possibly splitting this entry into separate areas to assess different levels of risk relating to published and unpublished source code in legacy research web collections.

These collections are valuable but lose funding and care as institutions re-configure their tasks and individuals retreat from tasks due to retirement or (as volunteers) to old age.

There are an endless number of legacy research web resources out there that people don’t know about.

This is not necessarily a technical challenge but a resource challenge.

The Internet Archive and other national web archiving bodies have copies of a lot of websites that would fit into this category but by no means all. There’s also a distinction between the software or code used to deliver the user experience and the data. Such code is secondary to the content.

This issue can be intensified by legacy IT infrastructure in cases where much of the content is hosted there, as security concerns may lead to imminent disposal of content. In these scenarios, the imminence of action becomes more urgent given the security issues posed by hosting legacy technology, software and similar services.

Case Studies or Examples:

  • One example of an at-risk legacy research web collection, provided by the nominator of this entry, is the unpublished digital indices and transcriptions in the DIMEV Open-Access, Digital Edition of the Index of Middle English Verse. The index comprises transcriptions made by a research team of Middle English text which were gathered as XML sheets and built upon a print publication: the Index of Middle English Verse (1943). These transcriptions involved significant financial and time investment and many are transcriptions of material unavailable online as digital facsimiles. It is uncertain how the data that underlies the web resource is stored, whether it is being stored by a university, or whether it could easily be recovered. See Mooney, L., Mosser, D., Solopova, E., Thorpe, D., Hill Radcliffe, D., Hatfield, L., Cornelius, I. and Johnston, M. (n.d.) ‘The DIMEV: An Open-Access, Digital Edition of the Index of Middle English Verse’. Available at: https://www.dimev.net/ [accessed 24 October 2023]

  • The recovery of the VecNet archive of malaria-related publications offers another example that also has obvious public health implications. VecNet was founded in 2011 as a network of institutions assembled to address the concerns and recommendations of the Malaria Eradication Research Agenda initiative. It became a portal for malaria information and analysis tools, with the goal of extending present vector control interventions and enabling incorporation of additional interventions to achieve elimination. By 2019 an important component of the portal, the DataCite repository, ceased to be available. However, the Vector-Borne Disease Network Data Warehouse (VecNet-DW), a project of departments of University of Notre Dame and the Institute of Tropical Health and Medicine at James Cook University, retained the relevant data and is collaborating with Data Futures, which created the new Invenio repository. See Invenio (n.d.), ‘VecNet’. Available at: https://vecnet.nd.hasdai.org/ [accessed 24 October 2023].

  • Preserving the Carmichael Watson Research Project website at the University of Edinburgh: a case study on how this project website, only online from 2013 until 2018, came to be at imminent risk of permanent loss, and on the strategy undertaken to transform it into a more sustainable format through web archiving and to revive its public accessibility. See Day Thomas, S. and Hawes, A. (2021) ‘Using ArchiveWeb.page to capture the Carmichael Watson Project’, Web Archiving & Preservation Working Group - General Meeting December 2021. Available at: https://www.youtube.com/watch?v=0CWMwJn6p-w [accessed 24 October 2023]

  • Fellgett, M. (2021) ‘Secure your digital datasets — by letting a data centre look after them!’, British Geological Survey Blogs. Available at: https://www.bgs.ac.uk/news/secure-your-digital-datasets-by-letting-a-data-centre-look-after-them/ [accessed 24 October 2023]

Media Inside Paper Files

Critically Endangered

Media inside paper files have occurred in records since the 1980s and will continue to do so for many years.

Digital Species: Portable Media

Trend in 2022: No Change

Consensus Decision

Added to List: 2019

Trend in 2023: No Change

Previously: Critically Endangered

Imminence of Action

Action is recommended within three years, detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques.

Examples

Digital media mixed with paper files in records offices and filing cabinets of almost every kind of enterprise.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Unsustainable effort to assess; exotic or obsolete media; poor storage; lack of descriptive labelling.

‘Endangered’ in the Presence of Good Practice

Carefully labelled; managed programme of assessment and retrieval; robust media used.

2023 Review

This entry was added in 2019 to report the significant amounts of digital media being transferred to archives folded into traditional files. The 2019 Jury noted that it is relatively simple to preserve this material using standard tools once identified, but that it can be an ‘unknown unknown’ and that assessment can seem overwhelming; therefore, although it may overlap with other portable media risks, it has a higher risk classification. The 2021 Jury agreed on a 2021 trend towards greater risk due to the increased time sensitivity and the need to conduct collection audits as soon as possible, in order to determine what is held and then work out a plan for opening carriers, assessing files, and extracting them if significant.

The 2023 Council agreed with the risk classification of Critically Endangered with the overall risks remaining on the same basis as before (no change to the trend).

Additional Comments

This is highly dependent on who is looking after the portable formats. There are good examples, for example in libraries, where disks are stored at the back of books or in front of magazines and can be processed at the point of acquisition. In archives, however, dealing with bit-level preservation of external media (often on legacy formats) is largely an unquantified problem, and so resource commitments will not be in place. So, there is a method and tools but simply no time committed and no proper assessment either. In other agencies, the issue will not have even been considered, and for them, it will be much harder over time with some inevitable loss.

Non-current Hard Disk Technologies

Critically Endangered

Materials saved to storage devices with a variety of underlying magnetic or solid-state technologies that are hardwired into a computer that is no longer under warranty or supported: typically, hard disks more than five years old.

Digital Species: Integrated Storage

Trend in 2022: No Change

Consensus Decision

Added to List: 2019

Trend in 2023: No Change

Previously: Critically Endangered

Imminence of Action

Action is recommended within three years, detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

Loss seems inevitable: loss has already occurred or is expected to occur before tools or techniques develop.

Examples

Disks installed into computers or servers that are more than five years old, or out of warranty.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Lack of replication; poor storage; non-standard connections or controllers; aggressive compression; encryption.

‘Endangered’ in the Presence of Good Practice

Maintenance schedule; renewable extendable warranty; best practice storage and operation; replication.

2023 Review

This entry was added in 2019 to ensure that the range of media storage is properly assessed and presented. The lifecycles of most consumer hard disk technology are relatively stable in comparison to portable devices because the disks are integrated into systems and therefore inherit the lifecycle and replacement of the entire system. This is less true at scale, however, where disks are used in storage arrays and refreshment is more loosely tied to the server architecture. Storage at scale also means the likelihood of encountering a disk failure increases, and this likelihood of failure led to the 2021 Jury’s noted trend towards greater risk. It was reviewed in 2022 with no noted change towards even greater or reduced risk.
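The point about failure likelihood at scale can be illustrated with a short calculation. The sketch below is not taken from the Jury's assessment: it assumes that disks fail independently and share the same annual failure rate, and the 2% rate used is a hypothetical figure chosen only for illustration. The function name `prob_any_failure` is likewise invented for this example.

```python
def prob_any_failure(n_disks: int, annual_failure_rate: float) -> float:
    """Probability that at least one of n_disks fails within a year.

    Assumes (for illustration only) that every disk fails independently
    with the same annual failure rate.
    """
    if n_disks < 0:
        raise ValueError("n_disks must be non-negative")
    if not 0.0 <= annual_failure_rate <= 1.0:
        raise ValueError("annual_failure_rate must be between 0 and 1")
    # P(no disk fails) = (1 - p) ** n, so P(at least one fails) = 1 - (1 - p) ** n
    return 1.0 - (1.0 - annual_failure_rate) ** n_disks


if __name__ == "__main__":
    afr = 0.02  # hypothetical 2% annual failure rate, not a measured value
    for n in (1, 10, 100, 1000):
        print(f"{n:>5} disks: {prob_any_failure(n, afr):.1%}")
```

Under these assumptions, the chance of seeing at least one failure in a year rises from 2% for a single disk to about 18% for ten disks and about 87% for a hundred, which is why replication matters more as arrays grow.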

The 2023 Council agreed with the current Critically Endangered classification with overall risks remaining on the same basis as before (no change to the trend), while also noting a greater inevitability of loss from the discontinuation of support and development for these storage technologies when compared to the 2021 Jury review.

Additional Comments

Many early PCIe flash devices (e.g. Fusion-IO) used proprietary drivers before the NVMe standard was developed, and are now losing support. Intel has stopped development of Optane non-volatile RAM, some forms of which required specific CPU support to access, although that form was usually used for data caching rather than storage.

Accessing drives with pre-SATA interfaces is increasingly difficult since interface cards and OS support can be hard to come by.

The greater density of newer disks, as well as encryption and compression, means they can be more fragile than older, lower-density disks with less sophisticated read/write technologies. The age of a disk is not the best or only indicator of its reliability.

Unpublished Research Data from Government Researchers

Critically Endangered

Data sets and research outputs produced in the course of government research but never shared or made available outside of the initial research. In particular, the risk classification applies to research data under government embargo, restrictions due to sensitivities, classification issues, and/or materials suppressed for ideological reasons.

Digital Species: Research Outputs

Trend in 2022:

No Change

Consensus Decision

Added to List: 2019

Trend in 2023:

No Change

Previously: Critically Endangered

Imminence of Action

Action is recommended within twelve months; detailed assessment is a priority.

Significance of Loss

The loss of tools, data or services within this group would impact on many people and sectors.

Effort to Preserve | Inevitability

Loss seems likely: by the time tools or techniques have been developed the material will likely have been lost.

Examples

Data sets or research outputs produced for agencies that have closed or have had funding withdrawn from research initiatives; research data from government agencies that are no longer active.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Lack of access to archival services; sudden or unanticipated closure; loss of implicit knowledge from destabilized or demoralized staff; encryption.

‘Endangered’ in the Presence of Good Practice

Archival responsibility well developed; documentation; published through research channels.

2023 Review

This entry was added in 2019 under ‘Unpublished Research Data from US Govt Researchers’. It has significant overlaps with other entries in the research outputs group but was retained to draw attention to two realities: firstly, that research outputs are not simply a matter for academic institutions and that government is, in fact, a major producer of research data; and secondly, that political instability and threats to the continuity of government services are a significant preservation risk. The 2019 entry description noted that while it related to the US, this did not mean that other jurisdictions are immune from political instability, and commented that politically inconvenient research outputs face particular and immediate threats of which the digital preservation community should be cognizant.

The 2021 Jury agreed with this concern and its broader applicability but recommended that this should be made more explicit, and that both title and description should be broadened to include governments across national and international contexts. This change does mean that the risk profile will vary depending on the political system, the political change and the measures in place to save and reuse data from disbanded research projects; in other words, there may be instances where the unpublished research data in one country falls under the Vulnerable category.

The 2023 Council agreed with the Critically Endangered classification but recommended getting an expert in this area for the next review. A further recommendation was made that this should not be an individual entry and instead be an example under the Unpublished Research Data entry.

Additional Comments

The US made the news under the last government, but this is probably an issue in other countries as well and is, therefore, a category that could be made more generic. One question to ask is whether the research data is considered to be of long-term value or ephemeral.

Web Domains with no Legal Deposit

Critically Endangered

This entry concerns the preservation of websites and domains that fall outside the remit of legal deposit (or for which no legal deposit mandate exists). Web archiving is able to capture large quantities of materials with routine and standards-based tools, but significant issues arise from the intellectual property rights associated with website capture and republication. In many jurisdictions, but by no means all, those obstacles are overcome by regulations that enable a national library or other ‘legal deposit’ agency to copy and preserve content. Where no such permission exists, there is a significant risk of loss.

Digital Species: Web

Trend in 2022:

No Change

Consensus Decision

Added to List: 2019

Trend in 2023:

No Change

Previously: Critically Endangered

Imminence of Action

Immediate action necessary. Where detected, material should be stabilized and reported as a matter of urgency.

Significance of Loss

The loss of tools, data or services within this group would impact on many people and sectors.

Effort to Preserve | Inevitability

Loss seems inevitable: loss has already occurred or is expected to occur before tools or techniques develop.

Examples

Domains registered without a country code; domains with a country code but weak or unenforceable legal deposit permission to harvest.
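As a rough triage aid for the first example, a hostname can be checked against the convention that two-letter ASCII top-level domains are country codes. This is a minimal sketch only: the function name is hypothetical, internationalized country-code domains (those beginning `xn--`) are not handled, and the presence of a country code says nothing about whether that country's legal deposit regime actually covers the site.

```python
def has_country_code_tld(hostname: str) -> bool:
    """Return True if the hostname ends in a two-letter ASCII TLD.

    Assumption: two-letter ASCII TLDs are country codes. IDN country-code
    TLDs are deliberately out of scope for this sketch.
    """
    labels = hostname.strip().rstrip(".").lower().split(".")
    tld = labels[-1]
    # Require at least one label before the TLD so a bare "uk" is rejected.
    return len(labels) > 1 and len(tld) == 2 and tld.isascii() and tld.isalpha()


if __name__ == "__main__":
    for host in ("www.example.co.uk", "example.org", "example.de"):
        print(host, has_country_code_tld(host))
```

Hosts for which this returns False would be candidates for the "registered without a country code" case; hosts for which it returns True still need the relevant national mandate to be checked by hand.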

‘Practically Extinct’ in the Presence of Aggravating Conditions

Lack of legal deposit mandate or remit; rapid churn of websites; lack of access to Internet Archive harvest; contentious content; encryption; digital rights management; non-standard content management.

‘Endangered’ in the Presence of Good Practice

Permissive approach to legal deposit; legislation to support and/or manage associated risks.

2023 Review

This entry was added in 2019. It is characterized by regulatory barriers rather than technical ones, though the pace of change in web technologies, as well as the growth of web content, means that significant technical challenges still exist. The 2019 Jury noted that local conditions are also a significant factor. For example, websites often also fall under public records legislation or are important elements of corporate records, and so important parts of the web are harvested even when there is no explicit legal deposit legislation. Moreover, the Jury particularly recognizes the work of the Internet Archive to capture and preserve content. Even so, there are significant gaps in web archiving, and in too many cases, it is regulation that is the barrier. The 2021 Jury agreed with this description and classification but added that in some limited instances, pywb tools (as opposed to automated web crawlers like Heritrix) could effectively capture the look and feel of a platform interface, preserving legacy versions for users to interact with in the future. However, pywb tools are manual and therefore cannot address the scale of this issue. They also do not capture interfaces in a way that makes it possible to recreate them in the future; they only allow interaction with a defined set of web pages. Because of this growing issue of scale, the 2021 Trend was towards greater risk. The 2022 Taskforce noted no change to the trend.

The 2023 Council agreed with the Critically Endangered classification and noted an increase in imminence and inevitability of loss, recognizing that while the need for major efforts to prevent or reduce losses continues, it is now much more likely that loss of material has already occurred, and will continue to occur, by the time tools or techniques have been developed. While the Council agreed the entry description should be updated to reflect these areas of discussion, overall risks remain and continue on the same basis as before (no change to trend).

Additional Comments

There is not only a significant risk of loss of the content but also a risk of loss of access. Unless the Internet Archive is picking these up, or permission regimes are in place, instances of the early web are gone forever and will continue to be lost.
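One way to check whether the Internet Archive holds a capture of a given site is its Wayback Availability API. The sketch below is illustrative and makes assumptions that should be checked against the Internet Archive's current documentation: the endpoint URL, and a JSON response containing an `archived_snapshots` object with an optional `closest` entry. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Assumed endpoint of the Wayback Availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url: str) -> str:
    """Build the query URL for a single site or page."""
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode({"url": url})


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the URL of the closest capture, or None if none is reported."""
    closest = payload.get("archived_snapshots", {}).get("closest") or {}
    return closest.get("url") if closest.get("available") else None


def lookup(url: str, timeout: float = 10.0) -> Optional[str]:
    """Query the API over the network and return the closest capture URL."""
    with urllib.request.urlopen(availability_query(url), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))
```

A result of `None` from `lookup("example.com")` would suggest that no capture is reported for that address, although it does not show that the content is unarchived elsewhere.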
