Web Domains with no Legal Deposit

   Critically Endangered

This entry concerns the preservation of websites and domains that fall outside the remit of legal deposit, or for which no legal deposit mandate exists. Web archiving can capture large quantities of material with routine, standards-based tools, but website capture and republication raise significant intellectual property issues. In many jurisdictions, but by no means all, those obstacles are overcome by regulations that enable a national library or other ‘legal deposit’ agency to copy and preserve content. Where no such permission exists, there is a significant risk of loss.

Digital Species: Web

Trend in 2022:

No Change

Consensus Decision

Added to List: 2019

Trend in 2023:

No Change

Previously: Critically Endangered

Imminence of Action

Immediate action necessary. Where detected should be stabilized and reported as a matter of urgency.

Significance of Loss

The loss of tools, data or services within this group would impact on many people and sectors.

Effort to Preserve | Inevitability

Loss seems inevitable: loss has already occurred or is expected to occur before tools or techniques develop.

Examples

Domains registered without a country code; domains with a country code but weak or unenforceable legal deposit permission to harvest.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Lack of legal deposit mandate or remit; rapid churn of websites; lack of access to Internet Archive harvest; contentious content; encryption; digital rights management; non-standard content management.

‘Endangered’ in the Presence of Good Practice

Permissive approach to legal deposit; legislation to support and/or manage associated risks.

2023 Review

This entry was added in 2019. It is characterized by regulatory barriers rather than technical ones, though the pace of change in web technologies and the growth of web content mean that significant technical challenges still exist. The 2019 Jury noted that local conditions are also a significant factor. For example, websites often fall under public records legislation or are important elements of corporate records, so important parts of the web are harvested even where there is no explicit legal deposit legislation. The Jury also particularly recognized the work of the Internet Archive to capture and preserve content. Even so, there are significant gaps in web archiving and, in too many cases, it is regulation that is the barrier.

The 2021 Jury agreed with this description and classification but added that in some limited instances, pywb tools (as opposed to automated web crawlers like Heritrix) could effectively capture the look and feel of a platform interface, preserving legacy versions for users to interact with in the future. However, pywb tools are manual and therefore cannot address the scale of this issue. They also do not capture interfaces in a way that makes it possible to recreate them in the future, only to interact with a defined set of web pages. Because of this growing issue of scale, the 2021 Trend was towards greater risk. The 2022 Taskforce noted no change to the trend.

The 2023 Council agreed with the Critically Endangered classification and noted an increase in the imminence and inevitability of loss. It recognized that, while the need for major efforts to prevent or reduce losses continues, it is now much more likely that material has already been lost, and will continue to be lost, by the time tools or techniques have been developed. While the Council agreed that the entry description should be updated to reflect these areas of discussion, overall risks remain and continue on the same basis as before (no change to trend).

Additional Comments

There is not only a significant risk of loss of content but also a risk of loss of access. Unless the Internet Archive is picking these sites up, or permission regimes are in place, early instances of the web are gone forever, and more will continue to be lost.

