DPC

Collections Information Management Data and Systems

Endangered

Descriptive information and data, covering both the systems (databases) and the data they contain. This includes information made publicly available, and information only available for internal use.

Digital Species: Museum and Gallery

New Rescoped Entry

Consensus Decision

Imminence of Action

Action is recommended within three years, detailed assessment within one year.

Significance of Loss

The loss of tools or services within this group would have a global impact.

Effort to Preserve | Inevitability

It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques.

Examples

Covered under this entry are third-party and in-house collections information management systems and databases, current and legacy, both large and small (e.g., Microsoft Access, FileMaker Pro).

‘Critically Endangered’ in the Presence of Aggravating Conditions

Poor or no documentation; lack of technical and preservation infrastructure; complex interdependencies on specific hardware, software or operating systems; significant volumes or diversity of data; conflation of access with preservation; dependence on proprietary products; lack of preservation capacity in museum or gallery; poorly developed or no processes for migration or normalization.

‘Vulnerable’ in the Presence of Good Practice

Strong documentation; preservation capability; strong repository and preservation technical infrastructure; good descriptive cataloguing; use of open formats and open source software; considered data management planning; licencing that enables preservation.
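As a rough illustration of what a normalization step towards open formats might look like, the sketch below exports every table of a small database to CSV and records a SHA-256 checksum for each export. It is only a sketch under stated assumptions: SQLite stands in for a system such as Microsoft Access or FileMaker Pro (which would need their own export route), and the function name `export_tables_to_csv` is hypothetical rather than part of any collections management product.

```python
import csv
import hashlib
import sqlite3
from pathlib import Path


def export_tables_to_csv(db_path, out_dir):
    """Export each user table in a SQLite file to CSV.

    Returns a manifest mapping each CSV file name to its SHA-256 checksum.
    Assumption: SQLite is a stand-in for the source system.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    manifest = {}
    conn = sqlite3.connect(db_path)
    try:
        # List user tables, skipping SQLite's internal bookkeeping tables.
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )]
        for table in tables:
            # Quote the identifier so unusual table names do not break the query.
            quoted = '"' + table.replace('"', '""') + '"'
            cursor = conn.execute(f"SELECT * FROM {quoted}")
            header = [column[0] for column in cursor.description]
            target = out / f"{table}.csv"
            with target.open("w", newline="", encoding="utf-8") as handle:
                writer = csv.writer(handle)
                writer.writerow(header)   # column names first
                writer.writerows(cursor)  # then every row
            manifest[target.name] = hashlib.sha256(target.read_bytes()).hexdigest()
    finally:
        conn.close()
    return manifest
```

An export like this does not replace documentation of the data model; it only gives the data itself a plainer, checksummed form that is less tied to one product.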

2023 Review

This entry was added in 2019 under ‘Digital Materials in Museums and Galleries’ and previously rescoped in 2021 to ‘Supporting Digital Materials for Museums and Galleries’.

The 2023 Bit List Council superseded the entry, splitting it into six discrete entries as the scope of the single entry was too broad to provide the guidance needed. The recommendation to break this entry down was also made by the 2021 Jury, as the types of digital collections content in museums can be vast and offer particular risks in museum and gallery contexts. For this entry on collections information management data and systems, context is important. For example, smaller 'in-house developed' and 'cottage industry' systems may be at higher risk than larger third-party systems with significant international buy-in and support.

Additional Comments

The 2023 Council additionally recommended that the next major review consider whether or not to split out the data held in Collections Information Management Systems from the systems themselves.

The loss of databases and catalogues can have a knock-on effect. The information they contain is valuable for contextualizing and understanding the resources they describe. Without them, meaning may be lost even if bits are not.

Oral Histories

Endangered

Oral histories including both audio and audiovisual (video and sound), and their accompanying transcripts and/or time-pointed summaries.

Digital Species: Museum and Gallery, Community Archives, Sound and Vision

New Rescoped Entry

Consensus Decision

Imminence of Action

Action is recommended within five years, detailed assessment within three years.

Significance of Loss

The loss of tools, data or services within this group would impact on many people and sectors.

Effort to Preserve | Inevitability

It would require a small effort to preserve materials in this group, requiring the application of proven tools and techniques.

Examples

Examples are wide ranging but can generally include born-digital or digitized material produced as an output of oral history projects; video or oral histories; transcripts, summaries, and other accompanying materials.

‘Critically Endangered’ in the Presence of Aggravating Conditions

Poor documentation; external dependencies; storage on old or degrading media; storage on consumer portable media; lack of preservation planning; lack of sustained funding; lack of ongoing investment in changing preservation requirements; lack of capability; dependence on small staff or volunteer resources; lack of standardized file naming.

‘Vulnerable’ in the Presence of Good Practice

Preservation capability; high quality storage; meticulous and consistent replication; stored in a trusted repository; preservation requirement understood; intellectual property managed to enable preservation; good descriptive cataloguing; persistent identifiers.
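One way to make 'meticulous and consistent replication' checkable is to compare checksums between a primary copy and a replica. The following is a minimal sketch, not a description of any particular repository's practice: the function names `sha256_of` and `compare_replicas` and the directory layout are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_replicas(primary_dir, replica_dir):
    """Report files in primary_dir that are missing or different in replica_dir.

    Returns a dict mapping each relative path to "missing" or "mismatch".
    An empty dict means every primary file has an identical replica.
    """
    primary = Path(primary_dir)
    replica = Path(replica_dir)
    problems = {}
    for source in sorted(p for p in primary.rglob("*") if p.is_file()):
        relative = source.relative_to(primary)
        copy = replica / relative
        if not copy.is_file():
            problems[relative.as_posix()] = "missing"
        elif sha256_of(copy) != sha256_of(source):
            problems[relative.as_posix()] = "mismatch"
    return problems
```

Run on a schedule, a check of this kind can surface silent corruption or incomplete copies of recordings and transcripts before the primary copy is the only one left.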

2023 Review

This entry was added in 2019 under ‘Digital Materials in Museums and Galleries’ and previously rescoped in 2021 to ‘Supporting Digital Materials for Museums and Galleries’.

The 2023 Bit List Council superseded the entry, splitting it into six more discrete entries as the scope of the single entry was too broad to provide the guidance needed. The recommendation to break this entry down was also made by the 2021 Jury, as the types of digital collections content in museums can be vast and offer particular risks in museum and gallery contexts. Approaches to preservation depend on whether these oral history recordings are held on analogue or digital portable media (e.g. external hard disk drives, audio or video tapes) or in a somewhat managed networked environment. If held on portable media, guidance for portable media should be followed.

Additional Comments

The 2023 Council agrees with the 2021 Jury Review recommendations that Museum & Gallery entries require further rescoping. With regard to this entry, the 2023 Council recommends that a future review should further rescope Oral Histories and Research Materials and Outputs because of overlaps and cross-referencing between them; time constraints meant this could not be done in the 2023 review cycle.

There may be a need for clarifying what falls under oral histories in the context of preservation at the organization - whether it includes audio and/or video recordings recorded for the purposes of creating oral history recordings (to be added to an organization’s collection), or for internal-only use. In addition, there may be some misidentification of oral history recordings, where the intent may have been to capture the recording as a research interview or as vox pops.

See also:

  • Two resources that may be helpful for remote recording: Oral History NSW (n.d.) ‘Recording remote Oral History interviews’. Available at: https://www.oralhistorynsw.org.au/equipment-remote [accessed 24 October 2023]; and Morgan, C., Perks, R., Stewart, M. and Johnston, C. (2021) ‘Remote oral history interviewing’, Oral History Society. Available at: https://www.ohs.org.uk/covid-19-remote-recording/ [accessed 24 October 2023].

Research Materials and Outputs in Museums and Galleries

Endangered

Digital material used in, or resulting from, research carried out on materials, digital or otherwise, held in galleries, museums, or similar. Research outcomes may not be formally published, and supporting datasets may not be formally accessioned or archived by an organization or a related organization. Access to these research materials and outcomes may only be made available for internal use, to inform other public outcomes, or for individual researchers.

Digital Species: Museum and Gallery, Research Outputs

New Rescoped Entry

Consensus Decision

Imminence of Action

Action is recommended within three years, detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve

Loss seems likely: by the time tools or techniques have been developed the material will likely have been lost.

Examples

Unpublished or published research papers, datasets, databases and other supplementary materials.

‘Critically Endangered’ in the Presence of Aggravating Conditions

Lack of documentation; lack of clarity with respect to intellectual property; unstable funding for repository; external dependencies.

‘Vulnerable’ in the Presence of Good Practice

Strong data management planning; preservation capability; good documentation; deposit into trusted repository.

2023 Review

This entry was added in 2019 under ‘Digital Materials in Museums and Galleries’ and previously rescoped in 2021 to ‘Supporting Digital Materials for Museums and Galleries’.

The 2023 Bit List Council superseded the entry, splitting it into six discrete entries as the scope of the single entry was too broad to provide the guidance needed. The recommendation to break this entry down was also made by the 2021 Jury, as the types of digital collections content in museums can be vast and offer particular risks in museum and gallery contexts. This entry on Research Materials and Outputs within the scope of Museums and Galleries differs from the entries found under Research Outputs: the latter focus on the institutional support found in higher education institutions, which is lacking in museum and gallery contexts.

Additional Comments

The 2023 Council agrees with the 2021 Jury Review recommendations that Museum & Gallery entries require further rescoping. With regard to this entry, the 2023 Council recommends that a future review should further rescope Oral Histories and Research Materials and Outputs because of overlaps and cross-referencing between them; time constraints meant this could not be done in the 2023 review cycle.

This research may be publicly or philanthropically funded. While research materials (used and/or developed in the course of research) and research outputs may not be made publicly available, they may be used to inform other outputs, e.g. exhibition, interpretation, conservation, etc.

Exhibition catalogues and interpretation of collections are often published online in research papers.

Maritime Archaeological Archives

Critically Endangered

These are collections of digital records from maritime archaeological work including photographs, maps and plans, field notebooks, post-excavation finds analysis and other analytical records.

Group: Museum Data

Trend: New Entry

Consensus Decision

Added to List: New Entry

Last update: New Entry

Previous category: New Entry

Imminence of Action

Action is recommended within 12 months, detailed assessment is now a priority.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve

It would require a major effort to prevent losses in this group, such as the development of new preservation tools or techniques.

Examples

Records of excavations in marine environments which may fall outside the jurisdiction of terrestrial heritage services.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Poor documentation; lack of preservation mandate; dependence on proprietary and non-standard data types.

‘Endangered’ in the Presence of Good Practice

Preservation planning from the outset; subject specialist repository; user community.

2019 Review

This is a new entry taken from the open submission process in 2019. It is grouped with Museum data sets as archaeological archives typically make their way to museums, but it is also closely aligned to research data.

Additional Jury Comments

There are trusted custodians of this data such as ADS, DANS or the British Museum, as well as in oceanographic research agencies, but it is perhaps hard to integrate good practice at an international scale. The real challenge therefore is in identifying and sustaining a custodian, as other bodies have experience with this data. The proliferation of innovative data recording technologies also implies likely problems of format dependence and documentation.

Media Art by Deceased Artists or Defunct Workshops

Critically Endangered

Media art where the artists or creative technicians are either deceased or not able to provide guidance on authenticity and installation.

Digital Species: Media Art

Trend in 2022: No Change

Consensus Decision

Added to List: 2019

Trend in 2023: No Change

Previously: Critically Endangered

Imminence of Action

Action is recommended within three years, detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

It would require a major effort to prevent or reduce losses in this group, including the development of new preservation tools or techniques.

Examples

Works produced by media artists now deceased, such as: Jeremy Blake, Beatriz Da Costa, Heiko Daxl or Stanislaus Ostoja-Kotkowski.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Lack of documentation to enable maintenance; lack of clarity with respect to intellectual property; complex interdependencies on specific hardware, software or operating systems; lack of capacity in the gallery or workshop; lack of strategic investment; complex external dependencies; loss of institutional memory resulting from staff churn; poor working relationship between the gallery and artist/workshop; lack of conservation assessment.

‘Endangered’ in the Presence of Good Practice

Strong documentation; clarity of preservation path and ensuing responsibilities; proven preservation plan; capacity of workshop to support re-installation; capacity of gallery to conserve; capacity of gallery to re-install; retention of institutional memory including archives of correspondence between gallery and artist/workshop; strong and continuing working relationship between the gallery and artist/workshop; regular conservation assessment.

2023 Review

This entry was added in 2019 as a subset of the 2017 ‘Media Art’ entry, which was first introduced with particular reference to historical media art but was split by the 2019 Jury to ensure greater specificity in its recommendations. This entry represents works held in galleries where the artist is deceased or the workshop has closed, and there is limited prospect of obtaining new documentation. The 2020 Jury found a trend towards greater risk because galleries, which often rely on visitors for income, had been closed for extended periods amid economic dislocation. The 2021 Jury agreed on a continued trend towards greater risk because the risk of loss is increasing and is more time-sensitive for early media artworks.

The 2023 Council agreed with the Critically Endangered classification with overall risks remaining on the same basis as before (no change to the trend).

Additional Comments

This entry includes a point in the lifecycle of all media art, so good practice recommendations are likely to become more important over time. Preservation issues may not become visible until the piece is brought out of storage for loan or exhibition, underscoring the value of continuous or periodic conservation assessment. The range of data/formats/hardware/software etc. can be new and varied, providing organizations with an ongoing technical challenge that they are not initially equipped to deal with. Some loss seems inevitable.

Preservation of legacy media artworks is dependent on access to obsolete technology and also the knowledge of how to operate said technology. Documentation of the production process and artist intent can be limited, and becomes more critical when there is no access to the artists or technicians. This creates risk around the preservation of a truly authentic artwork.

Case Studies or Examples:

  • Resources and outputs from the Preserving and Sharing Born Digital and Hybrid Objects From and Across The National Collection project. See V&A Research Projects (n.d.) ‘Preserving and Sharing Born Digital and Hybrid Objects’. Available at: https://www.vam.ac.uk/research/projects/preserving-and-sharing-born-digital-and-hybrid-objects [accessed 24 October 2023].

  • This includes decision model work around acquisition of complex collections such as born digital and hybrid art. See Ensom, T. and McConnachie, S. (2022) ‘Preserving and sharing born-digital and hybrid objects from and across the National Collection’, Decision Model Report: March 2022. Available at: http://doi.org/10.5281/zenodo.7097489

See also:

  • NEW MEDIA MUSEUMS: Creating Framework for Preserving and Collecting Media Arts in V4, initiated by the Olomouc Museum of Art as a joint international platform for sharing experience with building and maintaining collections of new media artworks across different types of institutions. The aim of the project is to find workable methods for heritage institutions to build and maintain collections of media arts, which are necessary for safeguarding this area for the benefit of society. See Central European Art Database (2021) ‘NEW MEDIA MUSEUMS: Creating Framework for Preserving and Collecting Media Arts in V4’. Available at: http://cead.space/Detail/projects/3797 [accessed 24 October 2023].

  • The Collaborative Infrastructure for sustainable access to digital art LIMA project, to prevent the loss of digital artworks and to commonly develop the knowledge to preserve these works in a sustainable way. The project ‘Infrastructure sustainable accessibility digital art’ invests in research, training, knowledge sharing and conservation to prevent the loss of both digital artworks and the knowledge to preserve them. See LIMA (n.d.) ‘Collaborative infrastructure for sustainable access to digital art’. Available at: https://www.li-ma.nl/lima/article/collaborative-infrastructure-sustainable-access-digital-art [accessed 24 October 2023].

  • Ellis, T. (2023) ‘Saving Stan: Preserving the Digital Artwork of Joseph Stanislaus Ostoja-Kotkowski’, iPRES 2023 Conference, Urbana-Champaign, Illinois, USA, 19–22 September.

Grey Literature

Critically Endangered

Semi-published research outputs such as blogs, dissertations, informal conference papers or commissioned reports which are not formally published but which can contain original and insightful contributions within scholarly communications. This entry covers a wide spectrum of very diverse types of materials which all have different preservation considerations.

Digital Species: Research Outputs

Trend in 2022: No Change

Consensus Decision

Added to List: 2019

Trend in 2023: No Change

Previously: Critically Endangered

Imminence of Action

Action is recommended within three years, detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

Loss seems likely: by the time tools or techniques have been developed the material will likely have been lost.

Examples

Blogs, technical reports, conference papers, dissertations, commercial research.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Originating researcher no longer active or changed research focus; staff on temporary contracts; dependence on single student or staff member; weak or fluid institutional commitment to subject matter; weak institutional commitment to data sharing; complicated or contested intellectual property; encryption; lack of recognition; non-disclosure agreements.

‘Endangered’ in the Presence of Good Practice

Use of persistent identifiers; embedded within repository infrastructure; quality assurance.

2023 Review

This entry was introduced in 2017 under ‘Research Data,’ though without explicit reference to grey literature. In 2019, the Jury split this entry into a range of contexts for research outputs. This entry represents activities which build towards formal publications and research outputs but which do not typically accumulate in institutional repositories. The 2021 Jury agreed; however, there was a significant difference between the 2020 trend and the 2021 trend. The 2020 Jury noted a trend towards greater risk because higher education and research institutions faced budget uncertainties, and a number of institutions had introduced early severance schemes or put staff on short-term contracts at greater risk of redundancy. While this puts other types of research output at risk, the ad hoc nature of grey literature means that this entry is at greater risk. Members of the 2021 Jury argued the content of grey literature is not entirely unique if it eventually makes its way into published outputs and noted improvements and initiatives towards preservation of semi-published research data and outputs over the last year, resulting in the consensus of a 2021 trend towards reduced risk.

The 2023 Council agreed with the Critically Endangered classification and noted that there will always be an element of risk to materials under this entry due to its semi-official nature. The Council also noted that this entry covers a wide spectrum of material and all had different preservation considerations.

Additional Comments

Loss of material like this would be common in the analogue world, but in the digital age we have the capacity, and perhaps something of a responsibility, to ensure that it is captured; failing to do so is more of a lost opportunity to extend the available research resource. The ADS’s Grey Literature Library demonstrates what could be done if information architectures are deployed to mirror and extend professional practice.

Workflows and policies regarding tagging, collecting and EDRMS may help protect such data into the future. Past materials are almost certainly partially lost.

Not all funder-maintained specialist repositories accept grey literature for long-term storage (e.g., UKRI-NERC EDS). Such deposits are redirected to generic open data repositories such as Zenodo, which mint DOIs but do not offer data quality assurance for different data types.

See also:

  • The Policy Commons has a mission to index and preserve grey literature from IGOs, NGOs, think tanks and governments and, to date, has indexed and preserved around 4 million items from c.11,000 institutions from across the world. See Policy Commons (n.d.) Available at: https://policycommons.net/ [accessed 24 October 2023]

Consumer Social Media Free at the Point of Use

Critically Endangered

Social media services offered free at the point of use with a subscription model based on reselling user behavior and/or advertising. This entry broadly includes digital content created, shared and hosted on social media platforms as well as interfaces of social media platforms.

Digital Species: Social Media

Trend in 2022: Towards even greater risk

Consensus Decision

Added to List: 2017

Trend in 2023: Towards even greater risk

Previously: Critically Endangered

Imminence of Action

Immediate action necessary. Where detected should be stabilized and reported as a matter of urgency.

Significance of Loss

The loss of tools or services within this group would have a global impact.

Effort to Preserve | Inevitability

Loss seems likely: by the time tools or techniques have been developed the material will likely have been lost.

Examples

Instagram, Facebook, X (previously Twitter), Pinterest, Yahoo Groups, Parler, Truth Social, Reddit, Mumsnet.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Lack of preservation capacity in provider; lack of preservation commitment or incentive from provider; proprietary products or formats; poor data protection; inaccessibility to web archiving; political or commercial interference; lack of offline equivalent; super-abundance; poorly managed IPR; lossy compression in upload scripts.

‘Endangered’ in the Presence of Good Practice

Offline backup and documentation of media assets; migration plan; early warning from vendors; roadmap from vendors; accessible to web harvest; suitable export functionality; licencing enables preservation; preservation commitment from vendor; preservation capability in vendor; resilient to hacking; selection criteria.

2023 Review

This entry was added by the 2019 Jury as a subset of a broader social media entry first introduced in 2017. It was created as a standalone entry to draw attention to the different threats faced by online services that are paid for versus ‘free at the point of use’ (both depend on the business model of the vendor and the terms which they impose). The 2021 Jury raised the risk classification from Endangered to Critically Endangered based on concerns arising with trends towards harmful and malicious hate speech as well as misinformation and deliberate deletion. The 2022 Taskforce agreed on a trend towards even greater risk based on the continued, significant trend towards hate speech, misinformation and disinformation, and deliberate deletion in light of ongoing global conflicts that include (but are not limited to) social and economic inequalities and climate change. In particular, they mentioned the sale of Twitter prompting a moment of instability in consumer social media, with the scale of Twitter, evident acrimony between parties prior to the sale and the hostile news coverage afterward elevating the risks associated with social media. They also drew attention to platforms enabling extreme views not permitted on mainstream platforms, which emerged and proliferated noticeably and which, from a preservation standpoint, could be argued to be both potentially at very high risk and historically significant.

Based on the assessment of the rescoped entry, the 2023 Council agreed on the Critically Endangered classification and noted an increase in imminence of action required as well as the effort to preserve. The need for major efforts to prevent or reduce losses continues, but it is now much more likely that material has already been lost, and will continue to be lost, by the time tools or techniques have been developed. There is a greater urgency to prioritize the assessment of these materials and develop tools or techniques to prevent or reduce further losses in this group.

The 2023 Council recommends further rescoping and adjusting of this and other social media entries in light of how web-based and cloud-based business products and services have developed in recent years. This included:

  • Clarifying the scope. This entry broadly refers to the preservation of content and interfaces of social media platforms, with these platforms designed to facilitate the creation and sharing of media through interactive social networks. These services, particularly those provided by largely unregulated (or underregulated) platforms, pose critical risks for not only capturing and preserving the content hosted on the social media platform but also the interfaces of the platforms themselves.

  • Similarly, the entry specifically refers to risks for digital materials created, shared, and hosted via social media services offered ‘free at the point of use,’ in which the business model and sustainability can only be guessed, and contracts tend to be asymmetrical in favour of the supplier. Moreover, because these services have a low barrier to entry, they may be favoured by agencies or individuals least able to respond to closure or loss.

  • As part of this rescoping, relevant information concerning cloud-based aspects was incorporated into the ‘Cloud-based Services and Communications Platforms’ entry to more clearly differentiate the risks associated with cloud hosting and computing technologies, allowing this entry on consumer social media free at the point of use to focus on challenges, notably those relating to harvesting and managing content and interfaces of web-based social networking platforms.

Additional Comments

The 2023 Bit List Council additionally recommends that the next major review for the Bit List includes:

  • A restructure and splitting of the entry to create separate entries for ‘digital content hosted on social media platforms’ and for ‘interfaces of social media platforms’, where each can be teased out to provide greater clarity about specific risks, aggravating factors and recommended actions. This should include expanding on API access to data, providing examples of legacy content already lost, and pointing to examples where risk is especially high (e.g., content that is still online but alarmingly fragile).

  • A consideration of merging the ‘Data Posted to Defunct or Little-used social media platforms’ entry with this entry, to incorporate examples of loss in the presence of aggravating conditions.

  • A consideration of merging the ‘Born Digital Photos and Video Shared on Social Media’ entry with this entry, to provide examples of particular types of digital content hosted on social media platforms that are lost or at risk. This is mostly due to the fact that so many of the ‘regular’ social media platforms have tended toward more ways to mimic or copy TikTok style videos, and making the distinction will become harder in the future since they all have similar functionality and ways to create photo/video content.

  • A consideration of merging the ‘Legacy Interfaces and Services’ entry with this entry, to provide examples of particular interfaces of social media platforms that are lost or at risk.

Social media free-at-the-point-of-use remains at a critical risk due in large part to the policies of unregulated (or underregulated) corporate platforms such as Facebook, X (previously Twitter), and their parent companies. The content shared on these platforms and the history of the development of platform infrastructure and policy itself provide a critical source of information for policy-makers and researchers. The complete lack of preservation provision and deliberate obstruction of archiving attempts for public interest puts this valuable content at high risk of loss and draws attention to the critical risk posed by these examples of platforms.

Content hosted on social media platforms (that users might not have stored elsewhere) is at risk and users may lose the opportunity to keep their own data for personal archiving or to donate to an organization. Collecting organizations may lose the opportunity to archive hosted content within their collecting remit using web or API harvesting tools. In both instances, data remains at high risk because it is hosted by companies that could change policies or access on a whim. A further problem is the inability to archive even free content unless the archivist has a login (as with Browsertrix). Additionally, there are social media companies requiring payment to access data for preservation.

Interfaces of social media platforms, which researchers may want to see in order to study the evolution of the platforms over time (typically through web harvesting), are also at risk. Preservation is affected by researcher API access being shut down, halting preservation of entire platforms. There are also differences between the themes and collecting policies of institutions and those of researchers who are scraping their own data and depositing it in repositories.

Preserving this material en masse is still incredibly difficult, but many of these platforms allow users to download their own personal content/archives. However, these downloads lose all the context of social media and therefore, whilst they do preserve the data, they do not preserve the essence of the material. Platforms like X (previously Twitter) have both opened and closed their API further in recent years, but others like Yahoo have closed, and Facebook and X continue to be almost hostile towards archiving and preservation attempts.

With digital materials from premium or institutional social media services, the business model and sustainability are more obvious, and contracts may be enforceable more readily. Moreover, because these services have a slightly higher barrier to entry, they may be favoured by agencies better able to respond to closure or loss. Traditional web archiving can be employed where the user pays for a service, but the content is ultimately publicly available (such as Flickr). But much is unclear about how to preserve internal social media / closed networks that web archiving cannot get to or existing tools do not cover.

Social media capture via web harvesting has become increasingly difficult. Social media platforms have done nothing to address the barriers to automated capture that prevent the preservation of even so-called public content. For example, campaign websites or other election-related content are sometimes published only on Facebook or on X (previously Twitter) because these services are ‘free’. This content is of particular concern as it appears on no other website. Web archivists are constantly shifting strategies and approaches and trying out new (but limited) tools to best capture this content. If we cannot successfully preserve these platforms, we are missing out on documenting organizations, campaigns and elections around the globe. Much of this data exists as data sets based on aggregated use rather than individual files.

Often these are external proprietary platforms bound by intellectual property law and potentially privacy law, which will impede prompt action. What recourse do archives or digital repositories have to deal with this and capture the materials?

 Case Studies or Examples:

  • A range of use cases are presented in Thomson, S. (2016) ‘Preserving Social Media’, DPC Technology Watch Report (16-02). Available at: http://doi.org/10.7207/twr16-02.

  • The National Library of Scotland ‘The Archive of Tomorrow: Health Information and Misinformation in the UK Web Archive’ project, to record the proliferation of misinformation about coronavirus. See Archive of Tomorrow (2022-2023), National Library of Scotland. Available at: https://www.nls.uk/about-us/working-with-others/archive-of-tomorrow/ [accessed 24 October 2023].

  • The archiving of the ‘In Her Shoes’ collection, part of the Archiving Reproductive Health (ARH) project. Working with key stakeholders, including activist organisations like Abortion Rights Campaign, Together for Yes, Terminations for Medical Reasons, Coalition to Repeal the Eighth, and many others, ARH gathered and preserved a selection of digital objects and research data, including social media, that tells part of the story of this historic campaign. ARH published collections of design and publicity material from activist groups, as well as a sequence of stories from the popular Facebook page ‘In Her Shoes’, a page where people anonymously shared stories of their experiences of being unable to access abortion in Ireland. This initiative received a 2022 Digital Preservation Award for Safeguarding the Digital Legacy. See Archiving Reproductive Health Project (2022), ‘Archiving Reproductive Health’, Digital Preservation Awards 2022. Available at: https://www.dpconline.org/events/digital-preservation-awards/dpa2022-archiving-reproductive-health [accessed 24 October 2023].

  • An example of a tool available to help libraries and archives with capture is Archive Social. See CIVICPLUS (n.d.) ‘ArchiveSocial’. Available at: https://archivesocial.com/ [accessed 24 October 2023].


Always Online Games


Critically Endangered

Video games that are required to be continuously online. Gameplay is referenced here particularly as means of participation, along with social media and in-game interaction between players. This can include Massively Multiplayer Online games and single player games with always-on DRM.

Digital Species: Gaming

Trend in 2022:

Towards even greater risk

Consensus Decision

Added to List: 2019 (rescoped 2023)

Trend in 2023:

Towards even greater risk

Previously: Critically Endangered

Imminence of Action

Action is recommended within three years, detailed assessment within one year.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

Loss seems likely: by the time tools or techniques have been developed the material will likely have been lost.

Examples

Fortnite, World of Warcraft, Neverwinter, League of Legends

‘Practically Extinct’ in the Presence of Aggravating Conditions

Controversies around IPR; lack of offline backup; changing business model of providers; limited recognition of value of game play; over dependence on goodwill of ad-hoc community; lack of preservation know-how at service providers; dependency on bespoke hardware or interfaces; increased reliance on always-on DRM for single player games.

‘Endangered’ in the Presence of Good Practice

Well documented code; IPR supportive of preservation; large and committed user community; removal of always-on DRM for single player games.

2023 Review

This entry was added in 2019 as a subset of the 2017 entry for “gaming”. The 2020 and 2021 Juries noted a trend towards greater risk, due to the increased significance of these games during the COVID Pandemic as well as the evolving nature of MMOs, to the extent that the 2021 Jury changed the risk classification from Endangered to Critically Endangered.

The 2022 Bit List Taskforce suggested that the 2023 Council consider the naming and scope of the entry. The 2023 Council agreed with this suggestion and rescoped this entry to Always Online Games covering all games that have to always be online, whether that is due to being MMOs, server-based games or single-player games with Always-Online DRM. Games that have online components but are not required to always be online fit into the new “Games with Online Play Components” entry.

Additional Comments

Preservation of Always Online games in a playable state requires preservation or re-creation of the servers that run them. Even then, for MMOs or multiplayer games, it would be impossible to recreate these games as they were at their various peaks: they will never again have the same configuration of subscribers, to say nothing of the innumerable changes made to the software over the years, which have significantly altered how the games work and look. Loss is inevitable, and it has already happened. This is why video recordings of (online) gameplay are important: the social and cultural aspects of play are incredibly important, and on-screen recording is the most robust way to capture them.

Whilst it is expected that MMOs and always-multiplayer games (such as Fortnite) would always require an internet connection due to their reliance on servers, always-online DRM in single player games, or in games where the primary gameplay is single player, adds a further risk to preservation. If the server shuts down, then even the single player components might not be playable, so loss happens faster than for a single player game that does not rely on servers. For more details, see the Shut Down or Discontinued Video Games entry.

Case Studies or Examples:

  • PCGamingWiki has an automated list of games that have Always-online DRM as well as a list of games that had Always-online DRM and have shut down. See PCGamingWiki (n.d.) ‘List of games using Always Online DRM’. Available at: https://www.pcgamingwiki.com/wiki/List_of_games_using_Always_Online_DRM [accessed 24 October 2023].

  • GOG is a digital distribution platform for video games and films that only distributes games that are DRM-free. See GOG (2022), ‘GOG 2022 UPDATE #2: OUR COMMITMENT TO DRM-FREE GAMING’. Available at: https://www.gog.com/news/bgog_2022_update_2b_our_commitment_to_drmfree_gaming [accessed 24 October 2023].

  • Dr Megan Winget’s ethnographic research project focused on supporting the collection and preservation of massively multiplayer online (MMO) games. See Winget, M. (2009) ‘Winget (Megan) Videogame Development Research Collection’, The University of Texas at Austin. Available at: https://repositories.lib.utexas.edu/handle/2152/8465 [accessed 24 October 2023].

See also:

  • The British Film Institute's “Embracing a wider screen culture” strategy notes the cultural significance of video games and states that they intend to embark on sector research, engagement and knowledge exchange (including on the preservation of video games and digital media). See BFI (n.d.) ‘Embracing a wider screen culture’. Available at: https://blog.bfi.org.uk/long-read/our-ambitions/embracing-a-wider-screen-culture/ [accessed 24 October 2023].


Open Source Intelligence Sources of Current Conflicts


Critically Endangered

Open source intelligence produced, collected and analysed from publicly, openly available social media and web content with the purpose of answering a specific intelligence question and that supports crowd-sourced investigation and fact-checking to verify or refute claims of state agencies and rebel groups in the context of current political or military conflict.

Digital Species: Legal Data

Trend in 2022:

Towards even greater risk

Consensus Decision

Added to List: 2019

Trend in 2023:

No change

Previously: Critically Endangered

Imminence of Action

Action is recommended within twelve months, detailed assessment is a priority.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques.

Examples

Social media sources relating to current conflicts, such as in Yemen or Syria.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Loss of authenticity; lack of preservation agency; limited or no digital preservation capability.

‘Endangered’ in the Presence of Good Practice

Offline backup captured by the journalist or investigating authority.

2023 Review

This entry was added as a subset in 2019, as part of a broader ‘Open Source Intelligence Sources’ entry which the Jury split into three elements, relating to current, recent and historic sources. This entry relates in particular to materials from current and ongoing conflicts. Social media companies have a policy of taking down or suppressing content that they consider to be propaganda for terrorist groups. This has had the unintended consequence of deleting or suppressing content that was being used in open source investigation or fact-checking for journalistic or judicial purposes, which may therefore impede refutation or prosecution. However, a new generation of cloud-based services, such as Hunchly, has emerged in the last few years. These allow investigators to copy and stabilize content to private accounts in the process of investigating it, so the ethical requirements of social media companies and the integrity of the investigation are both served. The 2021 Jury noted that such content stays at risk, and that the process of investigation is slower than algorithmic deletion. Nonetheless, there is a notable difference between the investigation of current conflicts and that of historic ones where evidence has been lost. The 2022 Taskforce identified a trend towards even greater risk based on the increased significance of crowd-sourced investigations and fact-checking in light of ongoing global conflicts that include (but are not limited to) those in Ukraine.

The 2023 Council agreed with the Critically Endangered classification with the overall risks remaining on the same basis as before (no change to the trend).

Additional Comments

The Council also added clarification to the meaning of ‘open source’ for this entry, to explain its meaning in relation to intelligence openly available online, noting that open source can also refer to a specific software or content licence that permits limited uses of IP so this distinction would be helpful for readers.

Preservation is important for social context, and content may be picked up inadvertently in other ways, but it is ambiguous who has ultimate responsibility for collecting and preserving this.

 

Case Studies or Examples:

  • The Ukraine Investigations by GLAN and Bellingcat Justice & Accountability project to investigate alleged atrocity crimes taking place in Ukraine. The aim of the project is to conduct a set of open source investigations into incidents causing civilian harm occurring in Ukraine according to robust legal standards with the aim of making them available to national and international prosecutors who are gathering evidence of alleged crimes. In this case, the open source content gathered during Bellingcat’s investigations will be preserved by Mnemonic, an independent third-party organization maintaining an archive of digital content from Ukraine, as it has done for Syria, Yemen and Sudan. See GLAN and Bellingcat (n.d.), ‘Methodology for Online Open Source Investigations’. Available at: https://www.glanlaw.org/online-open-source-methodology [accessed 24 October 2023]

See also:

  • The website of the Forensic Architecture (FA) research agency, based at Goldsmiths, University of London, offers examples of OSINT. See Forensic Architecture (n.d.). Available at: https://forensic-architecture.org/methodology/osint [accessed 24 October 2023]

  • The website of the Coalition for Content Provenance and Authenticity (C2PA). The C2PA addresses the prevalence of misleading information online through the development of technical standards for certifying the source and history (or provenance) of media content. C2PA is a Joint Development Foundation project, formed through an alliance between Adobe, Arm, Intel, Microsoft and Truepic. See Coalition for Content Provenance and Authenticity (n.d.). Available at: https://c2pa.org/ [accessed 24 October 2023]

  • Baumhofer, E. and Reilly, B.F. (2022) ‘Preserving Open Source Digital Evidence: A Guide for Practitioners Working on Dealing with the Past’, Available at: https://www.swisspeace.ch/articles/preserving-open-source-digital-evidence [accessed 24 October 2023]

  • Higgins, E. (2019) ‘Bellingcat and beyond. The future for Bellingcat and online open source investigation’, iPres Conference 2019, Amsterdam. Available at: https://www.youtube.com/watch?v=kZAb7CVGmXM [accessed 24 October 2023]

  • Dubberley, S., and Ivens, G. (2022) ‘Outlining a Human-Rights Based Approach to Digital Open Source Investigations’, The Human Rights, Big Data and Technology Project. Available at: http://repository.essex.ac.uk/32642/1/Outlining%20a%20Human-Rights%20Based%20Approach%20to%20Digital%20Open%20Source%20Investigations.pdf [accessed 24 October 2023]


Politically Sensitive Data


Critically Endangered

Digital content where the knowledge to preserve exists, and there is no threat of obsolescence, but where political interests may be served by elimination, falsification or concealment.

Digital Species: Political Data

Trend in 2022:

Towards even greater risk

Consensus Decision

Added to List: 2017

Trend in 2023:

No change

Previously: Critically Endangered

Imminence of Action

Immediate action necessary. Where detected should be stabilized and reported as a matter of urgency.

Significance of Loss

The loss of tools, data or services within this group would impact on people and sectors around the world.

Effort to Preserve | Inevitability

It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques.

Examples

Online News; social media and web-based campaigning; social media relating to 2016 UK/EU referendum; Promises made in Scottish independence referendum 2014; US Environmental Data; UK Private Finance Initiative (PFI) documents; Recordings of Leinster House.

‘Practically Extinct’ in the Presence of Aggravating Conditions

Opaque terms and conditions that facilitate deletion or obfuscation; lack of access to web-harvesting; significant lobby interest; change of administration; data resides in single jurisdiction; reputational risk to collecting institution.

‘Endangered’ in the Presence of Good Practice

Robust political archives; robust preservation services for investigative journalists.

2023 Review

The nature and extent of political campaigning online continue to become more apparent. This has drawn attention to the manipulation of digital media but not explicitly to the issue of deliberate deletion, alteration or concealment. GDPR provides a pretext for the disposal of records. The increased capability of archives to secure the content from outgoing governments and ministers is a source of encouragement. In Canada, for example, accusations that the incoming Liberal government had wiped the memory of the outgoing Conservative government were shown to be unfounded. Nonetheless, there is a pressing need for a deep and comprehensive assessment of the risks faced by politically sensitive data and the impact which such deletions have on the public good. That another year should have passed without such an assessment is a matter of serious concern, leading to the 2020 trend towards increased risk. The 2021 Jury agreed with this trend, given the continuation of significant political and economic upheaval, in part because of the pandemic, but also because of popular protest and the outcomes of elections around the world. Moreover, they noted that it had been widely reported that senior officials in government have avoided scrutiny and record-keeping laws by using self-deleting messaging applications. In these circumstances, politically sensitive records are likely to be at greater risk.

The 2022 Taskforce agreed on a trend towards even greater risk based on the increased significance of elimination, falsification or concealment in light of political upheaval, social and economic inequalities and climate change. The case of political upheaval and protest in Iran has further amplified the risks here. Anonymous digital art and social media activism have burgeoned in response to gendered violence and acts of political repression in the latter half of the year. However, preservation infrastructures, such as national libraries and collecting archives within universities, are conflicted and therefore unlikely, unable or unwilling to preserve content that is explicitly and radically critical of the regime.

The 2023 Council agreed with the Critically Endangered classification with overall risks remaining on the same basis as before (no change to the trend). They also provided discussion and comments around GDPR abuses. GDPR can be abused to block access to public records and political data. The existence of “special category data” under GDPR is used to justify denying access even to people’s own data. These justifications usually do not reflect the reality of how GDPR works at all, but they are used as a way to shut down these challenges.

Additional Comments

There is a question of whether it is the duty of archives/libraries not to preserve the falsification itself but instead to preserve the constituent pieces, to allow researchers to infer elimination, falsification or concealment.

See also:

  • World Wide Web Foundation, The Open Data Barometer, which provides a global measure of how governments are publishing and using open data for accountability, innovation and social impact. It looks at the 30 governments that have adopted the Open Data Charter and those that, as G20 members, have committed to the G20 Anti-Corruption Open Data Principles. See World Wide Web Foundation (n.d.) ‘The Open Data Barometer’. Available at: https://opendatabarometer.org/ [accessed 24 October 2023]

  • Ovenden, R., (2020) ‘Undelete our government’, Digital Preservation Coalition Blog. Available at: https://www.dpconline.org/blog/undelete-our-government [accessed 24 October 2023]

  • Mitcham, J. (2022) ‘What’s up with using WhatsApp?’, Digital Preservation Coalition Blog. Available at: https://www.dpconline.org/blog/what-s-up-with-using-whatsapp [accessed 24 October 2023]

  • Example of data rescue work by the Environmental Data & Governance Initiative (EDGI), initially formed in November 2016 to document and analyze changes to environmental governance that would transpire under the Trump Administration. EDGI subsequently became the preeminent watchdog group for material on federal environmental data issues on government websites, and a national leader in highlighting President Trump’s impacts such as declines in EPA enforcement. See Environmental Data & Governance Initiative (n.d.) ‘Archiving Data’. Available at: https://envirodatagov.org/archiving/ [accessed 24 October 2023]

  • Johnston, L. and England, E. (2021) ‘A Framework Enabling the Preservation of Government Electronic Records’, Digital Preservation Coalition Blog. Available at: https://www.dpconline.org/blog/bit-list-blog/blog-nara-wdpd [accessed 24 October 2023]
