The EaaSI Interaction API

Euan Cochrane

Last updated on 4 November 2020

Euan Cochrane is the Digital Preservation Manager at Yale University Library in New Haven, CT

Imagine being able to migrate any data from any legacy format to any compatible modern format automatically.

Imagine being guided through using legacy software with real-time demonstrations and tutorials that you can interrupt and take over from at any point.

Imagine being able to add a screen reader to any software, enhancing accessibility.

Imagine having access to comprehensive metadata about all software titles.

We are developing Emulation as a Service Infrastructure (EaaSI) software with a long time horizon - we expect the general approach that EaaSI is enabling, i.e. the ability to re-run legacy software at any point in time, to be necessary indefinitely.

Additionally, we are developing EaaSI with the intent that it provides generic, fundamental preservation and access infrastructure. We hope that others will build on it in the future, creating new implementations and tools on this infrastructure, and we are working to encourage that.

Given this context, I’m excited to share some news about features we’re adding to EaaSI that I think have the potential to open new opportunities for simplifying digital preservation and increasing the discoverability, accessibility, and usability of legacy software. Importantly, these new features will in turn enable increased access to the digital content that depends on this legacy software for long term access. I hope that the ideas outlined below inspire readers to try EaaSI.

In the second phase of the EaaSI program of work we are revisiting research the bwFLA team completed to enable migration of content between file formats using original software running in emulation. We’re building on this work by first re-implementing it and then by adding to it.

Essentially we’re adding the following features to the EaaSI software:

1. The ability to record mouse and keyboard inputs while using an environment
2. The ability to re-run recordings of interactions
3. The ability to programmatically run arbitrary mouse and keyboard commands in emulated environments
4. The ability to stream video/audio outputs to an analysis engine in real-time, or to a file for recording and later analysis
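
To make the first three features concrete, here is a minimal sketch of what recording and replaying inputs could look like. It is not the EaaSI Interaction API: the `InputEvent` and `Recording` types, the `replay` function, and the event fields are all hypothetical names invented for illustration, and `send` stands in for whatever would actually deliver an event to an emulated environment.

```python
from dataclasses import dataclass, field
from typing import Callable, List
import time


@dataclass(frozen=True)
class InputEvent:
    """One recorded mouse or keyboard action (hypothetical format)."""
    offset: float  # seconds since the recording started
    kind: str      # e.g. "key", "click", "move"
    data: dict     # e.g. {"key": "a"} or {"x": 320, "y": 240}


@dataclass
class Recording:
    """An ordered collection of input events captured from one session."""
    events: List[InputEvent] = field(default_factory=list)

    def add(self, offset: float, kind: str, **data) -> None:
        self.events.append(InputEvent(offset, kind, data))


def replay(recording: Recording,
           send: Callable[[InputEvent], None],
           sleep: Callable[[float], None] = time.sleep) -> None:
    """Deliver each event through `send`, keeping the original gaps between events.

    `send` is an assumption: a stand-in for the call that injects an
    event into an emulated environment.
    """
    previous = 0.0
    for event in sorted(recording.events, key=lambda e: e.offset):
        sleep(max(0.0, event.offset - previous))
        send(event)
        previous = event.offset
```

Passing `sleep` as a parameter keeps the sketch testable without real delays, and the same `replay` function could serve feature 3 if the `Recording` is built by a program rather than captured from a user.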

This opens up a world of new opportunities. I believe I’m only scratching the surface here, but to list a few, based on the introductory musings above:

1. The bwFLA team previously prototyped applying this approach to enable migration via emulation. This involves recording the process of opening a file in an emulated environment, saving it in a new format, and then shutting down the environment. That recording can then be re-run across many different files automatically. We aim to build on this to optionally also enable the approach used in the Universal Virtual Interactor, which uses the environments’ native scripting facilities to initiate opening digital objects in the requested software. In the future this approach could lead to a huge explosion of migration pathways being added to the Preservation Actions Registries, as every office application could become a tool that adds many different migration pathways/actions. Furthermore, these migration pathways could be chained together so that you might migrate content from a file in a very old format to a file in a newer format, then to a file in a still newer format, and so on, using emulation.

Economically migration by emulation is disruptive to the traditional “just-in-case someone wants to access it immediately” migration or normalization digital preservation model. In this traditional model the original files are usually kept alongside the new files created through migration in case they are needed for any reason. However I have been told that while archives are essential when they are needed, the vast majority of archived content is rarely or never accessed, meaning the work and energy used to migrate content, along with the storage costs to keep the original in addition to the migrated version, are wasted. But under the traditional model many have assumed we have to migrate everything now as the migration tools may not work in the future, or just in case a user needs something immediately in a form their contemporary software can interpret. Migration by emulation undermines the first of these assumptions (and immediate access in emulation undermines the second). If we are keeping emulated environments running forever anyway (to verify the content-integrity of digital objects, to provide the experience of the Digital Patina, and to ensure the most original experience of the digital content) then there is no additional cost to keep migration tools available forever when the migration tools are the emulated environments. If we are keeping the migration tools around anyway and can run them on demand, then why waste the resources to preemptively, or “just-in-case” migrate/normalize content to files in newer formats? Why not just migrate as needed and retain the results of the migration temporarily while access usage is relatively high?

2. We could record opening software and navigating to the open-as or save-as dialogue box and then capture and OCR the screen. From there (perhaps with some machine learning coupled with initial manual training) we ought to be able to identify the portion that is showing the formats that the software can open or save. This could dramatically improve our metadata creation speed for metadata about file formats that software is compatible with.

3. TrIDScan is a tool that will generate file format signatures from a folder of files that are known to be of the same format. We could generate data to feed TrIDScan using the API. This could be achieved by creating a number of files filled with many types of content (images, equations, formulae, etc), opening the files automatically in known software, and saving them automatically in a specific format. If we use many different types of sample files in one application to create new files all with the same format, we may be able to identify a signature for that specific format as created by that software. Similarly, if we use many different applications to create files that purport to be of the same format (according to the creating software), we could create general signatures for those formats by feeding TrIDScan all of those files.

4. We could have a set of known-format files that we attempt to automatically open in every application that might possibly open them. We could then screenshot the results and analyse those screenshots (manually or with some machine help) to identify if they opened successfully. This could be used to automatically verify the “compatibility” of legacy software applications with specific file formats.

5. By recording many applications and manually marking up different buttons, menus, and workflows we may be able to train a machine learning/Artificial Intelligence (AI) tool to automatically identify these in the future. This would in turn allow a more user-friendly interaction API to be created, one that accepts specific interactions instead of mouse positions, clicks, and key presses. Taking this a step further, buttons could be placed alongside environments embedded in web pages to enable users to undertake specific interactions with an environment without needing to know how to navigate an interface that may be quite foreign to them. For example there may be a button included for AutoCAD for MS-DOS to enable a user to “zoom in one level” (something that is otherwise quite tricky to understand how to do).

6. Building on the machine learning approach for identifying software components from point 5 above, software applications generally use fonts that are amenable to automated Optical Character Recognition (OCR). It should be possible to OCR screenshots of the interfaces of preserved software applications and use the machine learning approach to identify the components to which the labels apply. This would allow us to create a screen reader for any software. This could immensely increase the accessibility of both legacy software and the content it can be used to access.

7. Being able to record and replay interactions will enable new forms of software documentation to be created. It should be possible to create an interface to emulated environments that allows for replaying a set of actions and interrupting the replay to take over at a particular point. This may be particularly useful for explaining how legacy software interfaces worked and for providing enhanced commentary on video games and interactive digital art.

8. If we can ethically record user-interactions and analyse them we may be able to identify patterns in them. One application of this approach might be to create buttons for commonly executed interactions.

9. Distant reading is “an approach in literary studies that applies computational methods to literary data, usually derived from large digital libraries, for the purposes of literary history and theory.” It ought to be possible to use the interaction and recording APIs to undertake distant reading of whole computing environments or of software applications. A first thought for a possible application of this might be a comparative analysis of a single software title to understand how it and its feature-set evolved over time. One might also analyse how multiple competing products evolved over time. Presumably there are at the very least many applications of this approach for historical analysis and potentially for code analysis, for example for comparing code performance over versions using a known (emulated) hardware set.
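
The chaining of migration pathways mentioned in point 1 can be sketched as a search over a graph of formats. The example below is an illustration under stated assumptions, not part of EaaSI or the Preservation Actions Registries: the `Pathway` tuple layout and the `find_chain` function are hypothetical, and the format and environment names in the usage note are placeholders. It uses a breadth-first search, so it returns a chain with the fewest migration steps.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# (source format, target format, emulated environment that performs the step)
# This tuple layout is an assumption made for the sketch.
Pathway = Tuple[str, str, str]


def find_chain(pathways: List[Pathway],
               source: str,
               target: str) -> Optional[List[Pathway]]:
    """Return the shortest chain of migrations from source to target, or None."""
    # Index the pathways by the format they start from.
    by_source: Dict[str, List[Pathway]] = {}
    for step in pathways:
        by_source.setdefault(step[0], []).append(step)

    queue = deque([(source, [])])
    seen = {source}
    while queue:
        fmt, chain = queue.popleft()
        if fmt == target:
            return chain
        for step in by_source.get(fmt, []):
            if step[1] not in seen:
                seen.add(step[1])
                queue.append((step[1], chain + [step]))
    return None
```

Given placeholder pathways such as `("old", "mid", "env-1")` and `("mid", "new", "env-2")`, calling `find_chain(pathways, "old", "new")` returns both steps in order. Each step would then correspond to re-running one recorded interaction in the named environment.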

So while I hopefully have your interest, I must temper it a little with some realism about the potential issues that may get in the way of creating these promising tools and services using the EaaSI Interaction API:

1. This general approach has been implemented once previously by the bwFLA team and it worked well in limited circumstances. However, reimplementing it with the current code-base and ensuring it works in general will not be trivial.
2. Software can behave unpredictably. For example, random pop-ups, system notifications, and undismissable animations can appear, all of which make automating actions in software difficult to achieve reliably.
3. Interacting with different content in the same software can introduce additional unpredictability, particularly related to the time to undertake particular actions. This makes automating those actions difficult. You can overcome a lot of these problems by introducing long-enough time delays between steps in your actions. However time-delays may be computationally (and energy-usage-wise) costly. If this becomes the case we may need to instead invest in better real-time analysis in order to react as soon as events complete (such as when a file has opened, or a save-as process has completed) by using machine learning to create models for automatically identifying these events.
4. Any recording of user interactions, whether for later or real-time analysis, would require consent from the users, which could limit the speed at which we can apply machine learning approaches to improve the value of applications of the API. In my opinion gaining consent is essential despite the limitations it may introduce, but it will slow down progress.
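
As a simpler stand-in for the machine learning models suggested in issue 3, one could wait until the screen output stops changing instead of sleeping for a fixed, generous delay. The sketch below is only that, a sketch: `wait_until_stable` and its parameters are hypothetical, and `grab_frame` is an assumed callable that returns the current screen contents as bytes. It would not cope with the undismissable animations mentioned in issue 2, since the frame would never settle.

```python
import hashlib
import time
from typing import Callable


def wait_until_stable(grab_frame: Callable[[], bytes],
                      stable_polls: int = 3,
                      max_polls: int = 100,
                      poll_interval: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True once `stable_polls` consecutive frames are identical.

    Returns False if that does not happen within `max_polls` polls.
    `grab_frame` is an assumption: any function returning the raw
    screen contents of the emulated environment as bytes.
    """
    last_digest = None
    unchanged = 0
    for _ in range(max_polls):
        digest = hashlib.sha256(grab_frame()).hexdigest()
        if digest == last_digest:
            unchanged += 1
        else:
            unchanged = 1  # a new frame starts a fresh run of identical frames
        last_digest = digest
        if unchanged >= stable_polls:
            return True
        sleep(poll_interval)
    return False
```

A replay loop could call this between steps, so that a fast file-open proceeds after a few polls while a slow one is given more time, up to the `max_polls` limit.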

So hopefully this paints a picture of a small part of what’s to come with the EaaSI program of work. From increasing the accessibility of digital archives, software and emulated environments, to saving energy and time through enabling a new approach to long-term preservation and access, to improving software metadata, and to enabling “distant reading” of entire computers, the new possibilities the team are enabling with EaaSI are exciting to me and hopefully valuable and exciting to you all.

We’re aiming to roll out these features sometime next calendar year or possibly early 2022 (I know better than to state a firm deadline at this stage). In the meantime if any of this is of interest please do get in contact at eaasi@yale.edu. We’d love to hear your feedback, whether you would like to be involved in creating some of these potential tools and services using the new EaaSI Interaction API, or if you have any additional ideas for how it might be used.
