The table below lists some of the metadata fields that you may wish to capture from an EDRMS or other record keeping system when migrating records of long term value to a preservation system.

Different organisations may require different fields depending on their context and on the anticipated future users and use cases of the records. Each metadata field is therefore listed with a definition, some notes and a list of reasons why it might be important to capture in particular contexts.

Not all record keeping systems will capture and store all of the metadata fields described below. Many of the fields may be commonly found in an EDRMS, but perhaps not in other, less controlled systems in which records are stored.

Decisions on which metadata to capture will need to factor in the following considerations: 

  • Does the record keeping system store this information?

  • Can this information be extracted from the record keeping system?

  • Can this information be stored within the digital archive?

Record level metadata

You may wish to capture the following metadata at record level: 

Metadata field

Definition

Notes

Why you might need this

File name

The file name of the record as stored in the record keeping system

Note that the system may allow duplicate file names or may allow file names to include special characters that may cause problems once the files are exported into a file system (e.g. \/:?*"<>|). If this is the case, files may be renamed on export and it is important to ensure that the metadata includes details of the original file name of the object as stored in the system.

You should consider capturing this information in the following circumstances:

  • If this information is required to relate the exported digital records to their metadata.

  • If the file name contains valuable information which will help current and future users understand the file.

  • If current and future users will use the file name to refer to records or for search and retrieval.

  • If file names may be changed on export (for example to remove special characters or resolve duplicates).
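
As an illustration of the renaming issue described in the notes above, the following is a minimal sketch (not a feature of any particular EDRMS) of how export names could be made safe while keeping the original file name in the metadata. The function name `safe_export_names`, the metadata keys and the choice of an underscore as the replacement character are all assumptions made for this example.

```python
import re

# The characters listed in the notes above: \ / : ? * " < > |
UNSAFE_CHARS = re.compile(r'[\\/:?*"<>|]')


def safe_export_names(original_names):
    """Pair each original file name with a safe, unique export name.

    Returns one metadata dict per record so that the original name is
    never lost, even when the exported file has to be renamed.
    """
    used = set()  # export names already taken (compared case-insensitively)
    rows = []
    for original in original_names:
        cleaned = UNSAFE_CHARS.sub("_", original)
        stem, dot, ext = cleaned.rpartition(".")
        if not dot or not stem:
            # No extension (or only a leading dot): treat the whole name as the stem.
            stem, ext = cleaned, ""
        candidate = cleaned
        counter = 1
        while candidate.lower() in used:
            # Resolve duplicates by adding a numeric suffix before the extension.
            suffix = f"_{counter}"
            candidate = f"{stem}{suffix}.{ext}" if ext else f"{stem}{suffix}"
            counter += 1
        used.add(candidate.lower())
        rows.append(
            {
                "original_file_name": original,
                "exported_file_name": candidate,
                "renamed": candidate != original,
            }
        )
    return rows
```

Duplicates are compared case-insensitively here because some file systems do not distinguish between upper and lower case; whether that is appropriate will depend on the target environment.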

File format

The file format of the record as defined in the system

An EDRMS or other system may record the file format of each record. This may not be as thorough or accurate as the file format identification that you would wish to carry out within a digital archive (for example it may state the file is a PDF but not which version). It seems likely that file format identification would be carried out outside of the system, either as a pre-ingest step or as a part of the ingest process as records move into the digital archive.

You should consider capturing this information in the following circumstances:

  • If it is considered necessary to understand how the digital objects were identified within the original system.

  • If you have confidence in the file format metadata from the system.

  • If this information will provide a useful overview prior to more thorough file identification outside of the system.

Previous file format or file extension

The previous file format or extension of a record

In certain circumstances, a record keeping system may change the format of a file on capture or upload. An example that has been noted is the conversion of emails to a format specific to the EDRMS in which they are stored. If a conversion such as this has occurred, there may be evidence of this within the metadata.

You should consider capturing this information in the following circumstances:

  • If this behaviour (file format migration) has been noted within the record keeping system.

  • If being able to demonstrate provenance and history of the record is a priority.

MIME type

The MIME type of the record as defined in the system

The system may store information about the MIME type of each record, but this information is also typically captured as part of pre-ingest or ingest routines within a digital archive.

You should consider capturing this information in the following circumstances:

  • If it is considered necessary to understand how the digital objects were defined within the system.

  • If you have confidence in the MIME type metadata from the system.

  • If this information will provide a useful overview prior to more thorough file identification outside of the system.

File size

The size of the record (in KB/MB as appropriate)

The record keeping system may store information about the file size of each record, but this information is also typically captured as part of pre-ingest or ingest routines within a digital archive.

You should consider capturing this information in the following circumstances:

  • If this information will be useful to you for planning or validation purposes as you ingest the content into the digital archive.

Number of files

The number of files that make up a single record within the system. For example this may apply to the contents of a ZIP file, emails with attachments or the number of messages within a PST file.

This metric will only apply to certain records. Note that metadata about the total number of files within a transfer or export is discussed under ‘Transfer level metadata’.

You should consider capturing this information in the following circumstances:

  • If the system contains complex records consisting of multiple files.

  • If this information will be useful to you for planning or validation purposes as you ingest the content into the digital archive.

Digital object specific dimensions 

This metadata would be specific to particular types of digital object and could include:

  • Duration of audio-visual records

  • Word and page count of text

  • Dimensions of images

The system may contain metadata relating to the dimensions of digital objects and this will be specific to the types of records contained within it.

You should consider capturing this information in the following circumstances:

  • If the metadata contained within the system is of a higher standard of completion or accuracy than that which will be generated in the digital archive.

  • If this information will be useful to you for validation purposes.

Language

Language of the digital object.

 

You should consider capturing this information in the following circumstances:

  • If the records are in multiple languages.

  • If it will help future users of the records use and interpret them.

  • If it will help curators of the records describe and manage them.

Character encoding

Character encoding of the digital object. For example ASCII, Unicode, UTF-8.

 

You should consider capturing this information in the following circumstances:

  • If it will help future users of the records use and interpret them.

  • If it will help curators of the records describe and manage them.

Unique identifier (system generated)

The unique reference of a record within the originating system (typically assigned automatically)

Note that there may be more than one version of this identifier that can be captured. The identifier may reflect the function, context or structure of the record and how it was used.

You should consider capturing this information in the following circumstances:

  • If this information will be useful to you for validation purposes.

  • If users or curators of the digital object need to cross reference it with the original record within the system.

  • If current and future users will use this identifier to refer to records or for search and retrieval.

  • If the assigned identifier gives a useful indication of the function and/or context of the record and how it was used.

Agency assigned identifier

Catalogue or local identifier of the record within the system (typically assigned by a human operator)

May reflect the function/context of the object and how it was used.

You should consider capturing this information in the following circumstances:

  • If this information will be useful to you for validation purposes.

  • If users or curators of the digital object need to cross reference it with the original record within the system.

  • If current and future users will use this identifier to refer to records or for search and retrieval.

  • If the assigned identifier gives a useful indication of the function and/or context of the record and how it was used.

Previous identifier

A previous identifier allocated to a record

A previous identifier metadata field may be of value where records have previously been migrated from another system. The previous identifier field may be particularly important if relationships between documents are defined using these identifiers.

You should consider capturing this information in the following circumstances:

  • If this identifier is used to define relationships between records.

  • If current and future users will use this identifier for search and retrieval.

  • If the identifier gives a useful indication of the function and/or context of the record and how it was used.

  • If being able to demonstrate provenance and history of the record is a priority.

Title

Title or short description of the record

Sometimes records may not have meaningful titles assigned, or a set of records will share a very generic title. In some cases a short description field may be present instead of a title.

You should consider capturing this information in the following circumstances:

  • If assigned titles are useful for search and retrieval of the records within the archive.

  • If assigned titles will help users understand the context or content of the records.

Description

More detailed description of the digital object

 

You should consider capturing this information in the following circumstances:

  • If assigned descriptions are useful for search and retrieval of the records within the archive.

  • If assigned descriptions will help users understand the context or content of the records.

Export date

Date record was exported from the system

This date does not exist within the system but may be included as part of an export or transfer process. It can help demonstrate provenance and may also help with disaster recovery.

You should consider capturing this information in the following circumstances:

  • If being able to demonstrate provenance and history of the record is a priority.

  • If this information will be valuable to you for disaster recovery purposes.

  • If further versions of the record may be transferred at a later date.

Creation date

Date record was originally created

Note that this date may still be attached to the files as system info once the record is extracted, but system dates are vulnerable to change so extracting this date as metadata is a sensible precaution. Note that this date may reflect the date a record was originally uploaded to the system rather than the original creation date.

You should consider capturing this information in the following circumstances:

  • If being able to demonstrate provenance and history of the record is a priority.

  • If being able to demonstrate authenticity of the record is a priority.

Last modified date

Date record was last modified

Note that this date may still be attached to the files as system info once the record is extracted, but system dates are vulnerable to change so extracting this date as metadata is a sensible precaution. The system may be configured to capture a full audit trail, including dates of all edits to a record. Consider what level of detail is required for the digital archive.

You should consider capturing this information in the following circumstances:

  • If being able to demonstrate provenance and history of the record is a priority.

  • If being able to demonstrate authenticity of the record is a priority.
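
The notes above point out that file system dates are vulnerable to change after export. As a hedged illustration, the sketch below compares the last modified date held in the extracted metadata with the date the file system currently reports. The function names, the one second tolerance and the assumption that the recorded date is an ISO 8601 string are choices made for this example only.

```python
import os
from datetime import datetime, timezone


def file_system_modified(path):
    """Return the file system's last modified time as a UTC datetime."""
    return datetime.fromtimestamp(os.stat(path).st_mtime, tz=timezone.utc)


def modified_date_changed(path, recorded_iso, tolerance_seconds=1.0):
    """Report whether the file system date differs from the recorded metadata.

    `recorded_iso` is the last modified date extracted from the record
    keeping system, e.g. "2020-09-13T12:26:40+00:00".
    """
    recorded = datetime.fromisoformat(recorded_iso)
    if recorded.tzinfo is None:
        # Assumption for this sketch: dates without a time zone are UTC.
        recorded = recorded.replace(tzinfo=timezone.utc)
    drift = abs((file_system_modified(path) - recorded).total_seconds())
    return drift > tolerance_seconds
```

A result of `True` does not show that the record itself was altered, only that the file system date no longer matches the date captured as metadata.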

Date folder was closed

The date a folder was closed may act as a trigger date for export to the digital archive.

Depending on local practices, this action may be manually applied or automatically generated.

You should consider capturing this information in the following circumstances:

  • If this date is considered to have significance to the history of the record.

Review date

If a record is closed to the public this is the date it needs to be reviewed to see if it can be opened (unless a date open is already recorded - see below).

Note that this may be more broadly categorised as date of next action (where other proposed actions relating to a record are recorded).

You should consider capturing this information in the following circumstances:

  • If this information isn’t overridden by new processes and procedures applied in the archive.

Date open to public

The date a record can be (or was) opened for public access.

 

You should consider capturing this information in the following circumstances:

  • If this information isn’t overridden by new processes and procedures applied in the archive.

Date that the file became a record

The date that a file is marked as a record.

This may be a feature of some EDRMS and will depend on local practices. Depending on how the field is used in practice, it may not be particularly meaningful. For example sometimes a file may be marked as a record years after the record was created and/or last edited.

You should consider capturing this information in the following circumstances:

  • If being able to demonstrate provenance and history of the record is a priority.

  • If being able to demonstrate authenticity of the record is a priority.

Disposal date

Date that record can be disposed of.

This field will not be applicable to all organisations and implementations, but in some cases records transferred to archive will need to be disposed of at a later date. 

You should consider capturing this information in the following circumstances:

  • If this information isn’t overridden by new processes and procedures applied in the archive.

Creator

Individual or group primarily responsible for creating the record

There may be more than one creator, and depending on context you may want to record more granular roles. Note that there may be issues relating to how this is configured within the system (for example just as an identifier, which would need additional information to interpret), so it is important to ensure you get the details you need. Note also that there may be inaccuracies within the metadata. Systems and local practices will vary, but ensure you understand how it was generated. Is it added manually, extracted from the embedded metadata of a document or does the system generate it based on who uploaded the record (which may be different to who created the document)?

You should consider capturing this information in the following circumstances:

  • If being able to demonstrate provenance and history of the record is a priority.

  • If being able to demonstrate authenticity of the record is a priority.

  • If this information will help with search and retrieval of records within the archive.

  • If this information will help users understand and evaluate the record.

Creating organization

Details of organization responsible for creating record

As above, there may be issues relating to how this is configured in the system (for example just as an identifier, which would need additional information to interpret). It is important to ensure you get the details you need. Note also that there may be inaccuracies within the metadata. Systems and local practices will vary, but ensure you understand how it was generated. Is it added manually or does the system generate it based on who uploaded the record (which may be different to who created the document)?

You should consider capturing this information in the following circumstances:

  • If being able to demonstrate provenance and history of the record is a priority.

  • If being able to demonstrate authenticity of the record is a priority.

  • If this information will help with search and retrieval of records within the archive.

  • If this information will help users understand and evaluate the record.

Edited by

Information about who has edited the record since creation

Record keeping systems may capture a full audit trail for a record, including details of any edits made. You may want to capture this information alongside edit dates (described above under ‘last modified date’).

You should consider capturing this information in the following circumstances:

  • If being able to demonstrate provenance and history of the record is a priority.

  • If being able to demonstrate authenticity of the record is a priority.

  • If this information will help with search and retrieval of records within the archive.

  • If this information will help users understand and evaluate the record.

Classification code

Classification code

Also relevant is the record identifier (discussed earlier)

You should consider capturing this information in the following circumstances:

  • If this information will help with search and retrieval of records within the archive.

  • If this information will help users understand and evaluate the record.

Classification

Human readable description of above code 

The classification code may consist of a series of acronyms which are hard for a user to interpret. The system may also store a more human readable description of this code.

You should consider capturing this information in the following circumstances:

  • If this information will help with search and retrieval of records within the archive.

  • If this information will help users understand and evaluate the record.

Permissions

Who has rights to read/copy/edit a record within the system

This may be applicable in some circumstances, depending on the context, and can cover a variety of things.

You should consider capturing this information in the following circumstances:

  • If this information isn’t overridden by new processes and procedures applied in the archive.

  • If this information is useful to future users of the record.

IPR and holder

Including copyrights

 

You should consider capturing this information in the following circumstances:

  • If this information isn’t overridden by new processes and procedures applied in the archive.

Checksum

Checksum for the record

Alongside the checksum itself it may also be helpful to extract details of the date the checksum was generated and the algorithm used. Note that if a batch of records has been imported into an EDRMS or other system, the records may have come with a checksum. It would also be useful to capture this information about the previous checksum if it is present.

You should consider capturing this information in the following circumstances:

  • If being able to demonstrate authenticity of the record is a priority.

  • If this information is needed to validate that transfer has been successful.
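
To illustrate the notes above, this is a minimal sketch of generating a checksum together with the algorithm used and the date of generation, and of comparing a file against a previously recorded checksum. The function names, the metadata keys and the default choice of SHA-256 are assumptions for this example; the algorithm should match whatever the originating system or local policy uses.

```python
import hashlib
from datetime import datetime, timezone


def checksum_metadata(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Compute a checksum and return it with its algorithm and generation date."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return {
        "checksum": digest.hexdigest(),
        "checksum_algorithm": algorithm,
        "checksum_generated": datetime.now(timezone.utc).isoformat(),
    }


def matches_previous_checksum(path, previous_checksum, algorithm):
    """Check a file against a checksum recorded earlier (for example on import)."""
    current = checksum_metadata(path, algorithm)["checksum"]
    return current == previous_checksum.strip().lower()
```

The comparison lower-cases the previous value because some systems store hexadecimal checksums in upper case.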

Versioning

The version of the record

Multiple versions of any one record may exist within the system.

You should consider capturing this information in the following circumstances:

  • If you intend to capture more than one version of a record (either now or in the future).

  • If this information is of value to you and future users of the record.

Location within folder structure or hierarchy

Records within an EDRMS or other record keeping system may be placed in a particular structure/hierarchy or ‘tagged’.

Where a record sits within a structure can give valuable context to a record. It may not make sense once it is moved out of this structure. The location of the record within the structure should be captured in some way; this may or may not be through the metadata export.

You should consider capturing this information in the following circumstances:

  • If this information isn’t captured within the export in another way (for example in the exported folder structure).

  • If the location of the record gives useful context that helps users locate, understand and interpret the record.

Relationships with other records

Relationships with other records (not apparent through the folder structure or hierarchy)

Relationships with other records within the system may be present in other ways outside of the relationships described through a record hierarchy or folder structure within the system. For example an email record may contain an attachment or multiple files may form a single logical record (for example a GIS layer or a website).

You should consider capturing this information in the following circumstances:

  • If this information isn’t captured within the export in another way (for example by any associated records being exported in the same folder or zip file as the record).

Other descriptive metadata

Other descriptive metadata that exists within the system

Local practices will dictate what additional descriptive metadata is contained within any record keeping system and this will typically be used to help current users with locating and interpreting the records.

You should consider capturing this information in the following circumstances:

  • If this information is helpful to current and future users of the digital archive for locating or interpreting records.

 

Transfer level metadata

You may wish to capture the following metadata for the batch of records as a whole (rather than at record level):

Metadata field

Definition

Notes

Why you might need this

Total number and total size of files/records

The number of and size of files and/or records extracted from the system

Totals for records and files may be different (for example one record may consist of multiple files) so two different figures may need to be captured here.

You should consider capturing this information in the following circumstances:

  • If this information will be useful to you for planning or validation purposes as you ingest the content into the digital archive.
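
As the notes above explain, the totals for records and files may differ. The sketch below shows one way these figures could be calculated from an export so they can be compared with the totals reported by the originating system. It assumes a hypothetical manifest of `(record_id, relative_path)` pairs produced during export; the function name and the keys in the result are also assumptions for this example.

```python
from pathlib import Path


def transfer_totals(manifest, export_dir):
    """Summarise a transfer from a manifest of (record_id, relative_path) pairs.

    One record may consist of several files, so records and files are
    counted separately.
    """
    root = Path(export_dir)
    record_ids = {record_id for record_id, _ in manifest}
    sizes = [(root / relative_path).stat().st_size for _, relative_path in manifest]
    return {
        "total_records": len(record_ids),
        "total_files": len(sizes),
        "total_size_bytes": sum(sizes),
    }
```

Because the function reads each file's size from disk, a file listed in the manifest but missing from the export raises `FileNotFoundError`, which is itself a useful validation signal.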

System details

Details of the system that the records are being transferred from (for example name and version)

This information may need to be captured manually and incorporated into the metadata for each record. Additional documentation may also be required (see below).

You should consider capturing this information in the following circumstances:

  • If being able to demonstrate provenance and history of the record is a priority.

 

Additional documentation

Some organizations will also wish to capture a full set of system documentation relating to the record keeping system and how it was configured and used. This may include a data dictionary, records management policies and procedures, a user manual and documentation relating to the configuration or set up of the system. This level of documentation will provide an additional level of detail about the system and provide context for the records that are being preserved.

