Today's research, as well as significant parts of cultural activity, is conducted using electronic equipment, most notably computer systems and, to some extent, mobile electronic gadgets. Even more importantly, many born-digital assets are inextricably linked with their creation environment, which means that they require a specific runtime environment to be rendered and to restore their original utility, look-and-feel and performance. These runtime environments, however, are subject to short technological life-cycles and trends and are typically tightly coupled to specific hardware and software platforms. Hence, scientific and memory institutions (such as libraries, museums and archives), as well as companies and, increasingly, individuals, face the problem of managing their born-digital assets with a long-term perspective in mind.
To address this problem, the bwFLA project has developed a distributed, scalable and cost-effective Cloud-based Emulation-as-a-Service (EaaS) preservation framework, enabling a wide range of non-technical users to access emulation technology to preserve, re-enact and present born-digital assets. In particular, the bwFLA project had three outcomes: (1) an EaaS API and the necessary technical components, (2) a distributed data management model and its implementation and (3) workflows for ingest of and access to digital assets as well as for maintaining emulated environments.
The basic building blocks of EaaS are emulation components (ECs), deployable units that abstract the technological complexity of emulators. Technically, an EC encapsulates an emulator behind an abstract software interface with a unified set of functionality, such that emulators can not only be controlled and accessed in the same way, but different emulators also become interoperable, i.e. emulators from different vendors can be combined into a larger network compound. In particular, the EC API provides interactive access to the emulated environment (currently web-based HTML5, RDP and VNC), networking (between emulated machines and Internet access) and means to attach various data sources and virtual media (floppy, CD-ROM, HDDs). Currently, emulation components covering all major past and present operating systems and technical platforms are available.
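The idea of a unified interface over heterogeneous emulators can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class and method names are hypothetical and do not reproduce the actual bwFLA EC API, and the second emulator is deliberately made up. Only the QEMU flags (`-fda`, `-cdrom`, `-hda`) are real options. The sketch only builds command lines and does not launch anything.

```python
from abc import ABC, abstractmethod


class EmulationComponent(ABC):
    """Hypothetical uniform wrapper around one emulator (not the bwFLA API)."""

    SUPPORTED_MEDIA = ("floppy", "cdrom", "hdd")

    def __init__(self):
        self.media = {}       # medium kind -> image path
        self.running = False

    def attach_medium(self, kind, image_path):
        """Attach a virtual medium; the call is identical for every emulator."""
        if kind not in self.SUPPORTED_MEDIA:
            raise ValueError(f"unsupported medium: {kind}")
        self.media[kind] = image_path

    def start(self):
        """Mark the component as running and return the invocation to execute."""
        self.running = True
        return self.command_line()

    def stop(self):
        self.running = False

    @abstractmethod
    def command_line(self):
        """Return the emulator-specific invocation as a list of strings."""


class QemuComponent(EmulationComponent):
    """Maps the generic media kinds to QEMU command-line flags."""

    FLAGS = {"floppy": "-fda", "cdrom": "-cdrom", "hdd": "-hda"}

    def command_line(self):
        cmd = ["qemu-system-i386"]
        for kind, path in self.media.items():
            cmd += [self.FLAGS[kind], path]
        return cmd


class HypotheticalEmuComponent(EmulationComponent):
    """Made-up emulator with a different option syntax, for contrast only."""

    def command_line(self):
        return ["hypothetical-emu"] + [
            f"--{kind}={path}" for kind, path in self.media.items()
        ]
```

A caller can attach the same medium to either component through `attach_medium` and obtain a different, emulator-specific invocation from `start()`, which is the sense in which emulators are controlled in the same way.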
Typically, emulation services are neither used for long-running tasks nor requested in recurring and predictable patterns. Furthermore, depending on the re-enacted system, demands on computing resources can vary. Due to the unpredictable demand for such a service, a flexible and scalable deployment model for the allocation of computational resources has been implemented, i.e. emulation components are allocated and deployed only when needed. Currently, local computer-cluster deployments (e.g. MAAS) and major Cloud infrastructure providers are supported (through OpenStack and Juju). More technical details can be found in [1].
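Allocating components only when needed can be sketched as a small session pool. This is a simplified illustration under stated assumptions: the class name, the fixed capacity limit and the session identifiers are hypothetical, and a real deployment would delegate provisioning to the cluster or Cloud layer rather than to an in-memory dictionary.

```python
class OnDemandPool:
    """Hypothetical allocator: resources exist only while a session is active."""

    def __init__(self, max_instances):
        self.max_instances = max_instances
        self.active = {}      # session id -> requested environment name
        self._counter = 0

    def request(self, environment):
        """Allocate a component for one session, or fail if capacity is used up."""
        if len(self.active) >= self.max_instances:
            raise RuntimeError("no capacity left for a new emulation session")
        self._counter += 1
        session_id = f"session-{self._counter}"
        self.active[session_id] = environment
        return session_id

    def release(self, session_id):
        """Free the resources as soon as the session ends."""
        self.active.pop(session_id, None)
```

Between sessions the pool holds no resources at all, which matches the short-lived, unpredictable usage pattern described above.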
Providing emulation services to access preserved and archived digital objects poses further challenges to data management. Digital artefacts are usually stored and maintained in dedicated repositories, and object owners want to, or are required to, stay in control of their intellectual property. Hence, digital rights management and, in some cases, privacy concerns conflict with improved accessibility. Moreover, instances of rendering environments, i.e. an installed and configured OS plus software stack on a virtual disk image, may reach up to hundreds of gigabytes in file size. Even with currently available network bandwidth, copying a full environment to an EaaS Cloud service is inefficient and impairs the user experience. In addition, users may need to change, customize or personalize environments, requiring user modifications to be tracked and stored (efficiently) for subsequent usage.
Therefore, bwFLA has developed a distributed storage and data access interface that (a) ensures that users stay in control of their digital objects by storing modifications at the user's storage site, (b) provides efficient data transport even with limited bandwidth and (c) supports efficient management of user modifications. A more detailed technical description and use-cases can be found here.
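One common way to meet these three requirements is a block-level copy-on-write overlay. The text above does not state that bwFLA uses this technique, so the following is an assumption chosen for illustration, with hypothetical names throughout. The base image is never modified, only the blocks actually read are transferred, and all user changes live in a separate overlay that can be stored at the user's site.

```python
class CopyOnWriteImage:
    """Illustrative overlay: shared read-only base plus per-user modifications."""

    def __init__(self, base_blocks):
        self.base = base_blocks   # sequence of byte blocks, treated as read-only
        self.overlay = {}         # block index -> modified block (user-owned)
        self.fetched = set()      # base blocks that had to be transferred

    def read(self, index):
        """Serve modified blocks locally; fetch untouched blocks on demand."""
        if index in self.overlay:
            return self.overlay[index]
        self.fetched.add(index)
        return self.base[index]

    def write(self, index, data):
        """Record a modification in the overlay; the base stays untouched."""
        self.overlay[index] = data


base = (b"boot", b"system", b"apps", b"data")
image = CopyOnWriteImage(base)
image.write(3, b"user-data")
print(image.read(0), image.read(3), sorted(image.fetched))
```

In this sketch, persisting a customized environment only requires storing `overlay`, whose size depends on the number of changed blocks rather than on the size of the full disk image.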
A distributed Cloud environment like EaaS that builds on the principle of division of labour requires an automated orchestration model to support users in re-enacting and managing their customized EaaS instances. An EaaS instance should be deterministically replicable, i.e. it should be possible to run the same configuration again at a later point in time without reconfiguring all components. To address these problems and to re-enact instances reliably, a technical description of an emulation environment has been developed [1].
This type of technical meta-data is designed to be portable and independent of a specific EC implementation. Furthermore, it defines the attached data sources, such as the system environment to be run and, potentially, an object to be rendered. This description also enables the creation of a persistent identifier for a complex setup, allowing such an environment to be cited (e.g. http://hdl.handle.net/11270/767f2c0b-cce6-4623-8caf-f5a890afcb75) and enacted on demand. Presenting such a ready-made environment (i.e. runtime environment and object to be rendered) can be as easy as linking to or embedding a YouTube video. Because HTML5 technology is used for rendering, a modern browser (Chrome and Firefox are fully supported, Safari lacks OGG audio codec support) is sufficient for users to access and interact with a published object. The environment can be embedded into a web page (e.g. http://bw-fla.uni-freiburg.de/demo-flusser.html) or blog and can easily be shared using today's social media channels.
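How a portable description supports deterministic re-enactment and citation can be sketched as follows. The field names are hypothetical and do not reflect the actual bwFLA meta-data schema, and the content hash is only a stand-in for a registered persistent identifier such as the handle shown above. The point of the sketch is that the same description always serializes to the same bytes, so it can be stored, referenced and replayed without reconfiguration.

```python
import hashlib
import json


def describe_environment(emulator, system_image, rendered_object=None, settings=None):
    """Build a hypothetical, implementation-independent environment description."""
    return {
        "emulator": emulator,
        "system_image": system_image,
        "object": rendered_object,
        "settings": settings or {},
    }


def canonical_bytes(description):
    """Serialize with sorted keys so that equal descriptions yield equal bytes."""
    return json.dumps(description, sort_keys=True, separators=(",", ":")).encode("utf-8")


def fingerprint(description):
    """Derive a stable fingerprint (stand-in for a persistent identifier)."""
    return hashlib.sha256(canonical_bytes(description)).hexdigest()


env = describe_environment(
    emulator="example-emulator",
    system_image="example-os.img",
    rendered_object="example-object.iso",
    settings={"memory_mb": 64},
)
print(fingerprint(env))
```

Sorting the keys before hashing makes the fingerprint independent of the order in which the description was assembled, while any change to the emulator, image, object or settings produces a different value.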
Workflows for object preparation are a further major part of the framework, e.g. to create or modify system environments and to link a digital object to a specific rendering environment. While preparing a rendering environment through a guided process is optional, it results in technical meta-data with an exact description of the environment's view-path and its configuration. This information can then be (re-)used to classify and index rendering environments, so that they provide a base for other objects and a starting point for the creation of new derivatives. Furthermore, a rendering environment has to be evaluated with respect to a specific digital object. If the result is satisfactory, meta-data is produced that links the digital object to the technical description of the rendering environment and, additionally, describes potential shortcomings and other content-related performance observations.
Examples of workflows from different domains can be found here and tested at the bwFLA demo instance: https://demo.bw-fla.uni-freiburg.de. To log in, use the username "bwfla" and the password "demo".
[1] Thomas Liebetraut, Klaus Rechert, Isgandar Valizada, Konrad Meier and Dirk von Suchodoletz: Emulation-as-a-Service - The Past in the Cloud. In Proceedings of the 7th IEEE International Conference on Cloud Computing, IEEE.