From Paper to Preservica: How the Simcoe County Archives is Migrating Metadata between Systems

Olivia White

Last updated on 4 March 2026

Olivia White is Digital Preservation Archivist at Simcoe County Archives

The Simcoe County Archives in Minesing, Ontario, Canada has been collecting, preserving, and making available records pertaining to the history of Simcoe County since 1966. We are the oldest County-level archives in Ontario, and the finding aids for our collections began as hand-typed documents at the Archives’ inception.

With the adoption of an in-house database using DBTextWorks in 1997, newly processed records could be described digitally. In 2008, the Archives launched a public-facing database using the Web Publisher Pro add-on to DBTextWorks to allow researchers to remotely search descriptions of our collections on our website. In 2025, we split our descriptions into two public-facing databases, one for our government records and one for our non-government records. More than 22,000 descriptions of non-government records, spread across 200 finding aids created before the in-house database was implemented, still needed to be migrated. Migrating our physical descriptions will also require addressing our physical index cards, but that work falls beyond the scope of the migration project discussed here.

We developed a workflow to migrate data from our printed finding aids into our non-government records database and subsequently into our Digital Preservation System, Preservica. Our goal is to ensure that all of our descriptions are accessible in both the database and Preservica, such that both archival staff and public researchers will be able to consistently search our collections across platforms.

Step 1: Digitize and transcribe the printed finding aids

From 2020 to 2021, all printed finding aids were digitized and transcribed into a Microsoft Excel workbook with separate sheets for each finding aid. The separate sheets of all the finding aids were merged into one large Excel workbook using the “Get & Transform Data” tool within Excel to aid in the crosswalking process.

Step 2: Crosswalking metadata

The physical finding aids contain 11 metadata fields that are not consistent with the fields and formatting of the in-house database, which has 61. Accordingly, the data was crosswalked to the new database fields so that it could be uploaded into the in-house database.*

Printed Finding Aid Metadata Fields	Database Metadata Fields
Acc. #	Accession number
RAD Date	Dates of creation
Start Date	Start date
End Date	End date
Description (Scope and Contents)	Scope and content
Stack Location	Location
Access Restrictions	Restrictions on access
See also	Related materials at Simcoe County Archives

*The table shows the crosswalk between the printed finding aid metadata fields and the in-house database metadata fields.

There were three fields in the printed finding aids without a true equivalent in the in-house database: “Inventory of”, “Sub-category” and “Sub-Sub category.” These fields were important to keep while assigning subject headings and fonds names, but were intended to be deleted once the cleaning process was complete.

Step 3: Clean the metadata

To standardize the formatting of the metadata, I created a sub-workflow to outline the process of cleaning the descriptions to be uploaded into the non-government records database. Some of the printed finding aids were thematic and contained descriptions from various collections related to a local school or a church, for example. As a result, the same description could appear in more than one finding aid, and the large workbook was useful for identifying these duplicates. Using the “Inventory of” and “Accession number” fields, fonds/collection names and numbers were assigned and used for filtering.

Reviewing one collection at a time, we are able to remove duplicates and review each field for consistency. While many collections have been successfully cleaned, the cleaning process for all our physical descriptions remains ongoing.
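
The duplicate check described above could be sketched in Python as follows. This is only an illustration, not the process we ran in Excel: the choice of key fields, the `normalize` and `find_duplicates` helpers, and the sample rows (including the accession numbers) are all assumptions made for the example.

```python
from collections import defaultdict


def normalize(text):
    """Lower-case and collapse whitespace so trivial differences do not hide duplicates."""
    return " ".join(text.lower().split())


def find_duplicates(rows, key_fields=("Accession number", "Scope and content")):
    """Return lists of row indexes whose normalized key fields are identical."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        key = tuple(normalize(row.get(field, "")) for field in key_fields)
        groups[key].append(index)
    return [indexes for indexes in groups.values() if len(indexes) > 1]


# Hypothetical sample rows; real descriptions would come from the merged workbook.
rows = [
    {"Accession number": "000-00", "Scope and content": "Minute book of a local school board"},
    {"Accession number": "000-01", "Scope and content": "Church register"},
    {"Accession number": "000-00", "Scope and content": "Minute book of a  local school board "},
]
duplicate_groups = find_duplicates(rows)
```

A check like this only flags candidates. Deciding which copy to keep would still be a manual review step, one collection at a time.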

Step 4: Prepare descriptions to be uploaded into our database

Once a collection is ready to be uploaded, the descriptions can be copied into a macro-enabled Excel spreadsheet. The spreadsheet was created using code generated by prompting an Artificial Intelligence (AI) assistant tool. The prompt was “Write a macro for excel that takes one spreadsheet and crosswalks the fields to another spreadsheet while removing the fields that are not needed.” This was our first time using AI as a tool to aid our archival processes.

In consideration of the ongoing discussion of the impact of AI usage over time, we felt that the one-time generation of the code to be used throughout the cleaning process was a suitable small-scale test because it did not require us to input multiple prompts to meet our needs. The code was reviewed for accuracy before it was included in the workflow. The Macro-enabled spreadsheet has allowed us to save descriptions from the large workbook into a separate spreadsheet with the proper in-house metadata fields and automatically delete unnecessary fields. This automation has reduced the workload of manually copying these entries and deleting columns, with archival staff now only being required to check the converted files for errors.
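
For readers who do not work in Excel, the same idea can be sketched in Python. This is not the macro we use, which is not reproduced here. It is a minimal stand-in that assumes each description is held as a dictionary keyed by the printed finding aid field names from the crosswalk table. The `crosswalk_row` helper and the sample values are hypothetical, and only the eight mapped fields are shown rather than all 61 database fields.

```python
# Printed finding aid field -> in-house database field, taken from the crosswalk table.
CROSSWALK = {
    "Acc. #": "Accession number",
    "RAD Date": "Dates of creation",
    "Start Date": "Start date",
    "End Date": "End date",
    "Description (Scope and Contents)": "Scope and content",
    "Stack Location": "Location",
    "Access Restrictions": "Restrictions on access",
    "See also": "Related materials at Simcoe County Archives",
}


def crosswalk_row(row):
    """Rename mapped fields and drop any field that has no database equivalent."""
    return {new_name: row.get(old_name, "") for old_name, new_name in CROSSWALK.items()}


# Hypothetical description; "Inventory of" has no equivalent and is dropped.
printed_row = {
    "Acc. #": "000-00",
    "RAD Date": "1900-1910",
    "Description (Scope and Contents)": "Minute book",
    "Inventory of": "Example family",
}
database_row = crosswalk_row(printed_row)
```

Building the output from the mapping, rather than deleting unwanted columns one by one, means any unexpected extra column is left out automatically. Staff would still check the converted file for errors.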

Step 5: Upload the descriptions into our database

The cleaned descriptions, with the desired metadata fields, are saved as a Comma Separated Values (CSV) document. At this point, the CSV files are compatible with our existing sub-workflow that outlines how to upload CSV files into our database. This sub-workflow involves steps such as indicating during the uploading process that the pipe character (|) is used as an entry separator and that the first row contains field names. In this manner, descriptions are being incrementally uploaded into our non-government records database.
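
To make the file layout concrete, here is a small Python sketch of a CSV whose first row holds the field names and whose multi-entry fields are joined with the pipe character. It uses only the standard `csv` module. The `rows_to_csv` helper and the sample values are assumptions for illustration, not part of our upload sub-workflow.

```python
import csv
import io


def rows_to_csv(rows, fieldnames, entry_sep="|"):
    """Write rows as CSV text: a header row first, list values joined with entry_sep."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        flat = {}
        for name in fieldnames:
            value = row.get(name, "")
            # A field with several entries is stored as a list and joined here.
            flat[name] = entry_sep.join(value) if isinstance(value, list) else value
        writer.writerow(flat)
    return buffer.getvalue()


# Hypothetical description with two related-material entries.
fieldnames = ["Accession number", "Related materials at Simcoe County Archives"]
rows = [{
    "Accession number": "000-00",
    "Related materials at Simcoe County Archives": ["Fonds A", "Fonds B"],
}]
csv_text = rows_to_csv(rows, fieldnames)
```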

Step 6: Export descriptions for upload into Preservica

Once descriptions are properly uploaded into the database, they can also be exported as a CSV together with any other relevant descriptions. Preservica has recently enabled descriptive metadata to be assigned automatically to digital records by using a CSV during the uploading process.

By adding the digital file or folder title in the first column of the CSV, Preservica will assign the descriptive metadata automatically to the intended files and folders upon upload. This is more efficient than manually adding descriptions for each digital file or folder.

It is important to note that Preservica uses the semicolon as an entry separator, whereas our databases use the pipe character. During this stage of our workflow, each pipe character must be replaced with a semicolon. Grammatical semicolons in the text must also be changed, because Preservica will otherwise try to split the text into separate entries, which the rules of the metadata field may not permit, resulting in an error.
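
The order of these two replacements matters: grammatical semicolons have to be changed before the pipes become semicolons, or the two can no longer be told apart. A minimal Python sketch of that order follows. The `to_preservica_value` helper is hypothetical, and substituting a comma for a grammatical semicolon is an assumption made for the example. Any replacement that reads correctly would do.

```python
def to_preservica_value(value, source_sep="|", target_sep=";", semicolon_substitute=","):
    """Convert one field value from pipe-separated entries to semicolon-separated entries."""
    # Step 1: replace grammatical semicolons (assumed substitute: a comma)
    # while they are still distinguishable from entry separators.
    without_grammatical = value.replace(target_sep, semicolon_substitute)
    # Step 2: split on the database separator and rejoin with the Preservica one.
    entries = [entry.strip() for entry in without_grammatical.split(source_sep)]
    return target_sep.join(entry for entry in entries if entry)


# Hypothetical subject field exported from the database.
converted = to_preservica_value("Schools|Churches; rural")
```

In this example the two database entries stay as two entries, and the semicolon inside the second entry no longer splits it into a third.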

By using descriptions first uploaded into our non-government records database, we ensure that the descriptive metadata used in our soon-to-be public database for our digital records remains consistent between both systems.

Conclusion

We intend to use our workflow to continue migrating data from our printed finding aids into our non-government records database, to ensure that researchers can remotely search all our described collections and records. Since November 2025, 3,002 descriptions have been uploaded into our non-government records database. We also intend for our researchers to be able to search for digital records in a similar manner, once our public-facing database within Preservica is available. We are interested in learning from other institutions that have developed solutions for migrating data between different systems.
