Vera Ferreira is Depositing Officer for the Endangered Languages Documentation Programme, BBAW, in Lisbon, Portugal, and Leonore Lukschy is Programme Administrator & Communications Officer for the Endangered Languages Documentation Programme, BBAW, in Berlin, Germany.


Homepage of the Archive of Languages and Cultures of Ethnic Groups of Thailand (

 Showcasing the Archive of Languages and Cultures of Ethnic Groups of Thailand

Since the digital revolution, many people have become active in documenting endangered languages and the traditions tied to them, making recordings on phones, audio recorders and video cameras. In the best case, these materials do not end up on hard drives, laptops or CDs in private possession; instead, their creators make them available on the web in order to preserve them for posterity. Materials are uploaded to a variety of platforms: in some cases to websites created for this specific purpose, in others to commercial social media platforms. Websites created specifically for the dissemination of language recordings need to be maintained and funded. If the person or group in charge of hosting and maintaining a website no longer has the time or interest, or runs out of funds, the website and the materials on it may be taken offline. Commercial platforms are problematic because it is at the discretion of a private company whether or not the materials stay online. Neither individual websites nor social media platforms have standardised metadata, which means that even if these materials are online, they are not necessarily discoverable. And even if they are discoverable, there is no long-term preservation infrastructure behind them. If digital materials are not migrated to newer formats, they will not be accessible in the future, which makes digital files extremely volatile.

This is worrisome because many of these recordings are invaluable and may be the only record of a ritual, or of an elder, a holder of special knowledge, a shaman, or a singer of songs no one else remembers. Unless these materials are professionally archived and preserved long-term, humanity’s intangible heritage is at risk of being lost.

Against this backdrop, the project Archive of Languages and Cultures of Ethnic Groups of Thailand (http:// presents a bottom-up participatory approach for archive creation aimed at combining professional archiving and long-term preservation with enabling communities to disseminate their cultural heritage online.

The Archive of Languages and Cultures of Ethnic Groups of Thailand came to fruition through a collaboration between the Endangered Languages Archive (ELAR) and the Research Institute for Languages and Cultures of Asia at Mahidol University. The project was supported by the Newton Fund, with the aim of creating a digital platform for the preservation and dissemination of indigenous linguistic materials and cultural heritage in Thailand. Three factors led us to develop a community-oriented approach to archiving and to select Mukurtu as the digital platform: the richness of the publicly unknown data collected over the years in Thailand, the activism of several language community members and scholars in the country, and the lack of a digital archive for language materials. Mukurtu is a content management system (CMS) that enforces archiving best practices (such as metadata consistency, file format unification and access granularity) and lays the groundwork for professional archiving. It is fully customisable, including the interface language (the Mukurtu instance in this project was fully localised to Thai), simple to use and less academia-oriented. The resources (audio, video, pictures, texts), the languages and the speaker communities are in the foreground, which is an important feature for catching the attention of a broader audience and thus increasing the usability of the archived materials.

Mukurtu was combined with a working and backup server, to guarantee the preservation of original primary data and the necessary format migrations.

However, this is only the first step towards sustainable digital archiving. As mentioned before, while Mukurtu enforces basic archiving workflows, it is merely a CMS rendering a presentation layer. Throughout the project, the users entering data into the Mukurtu platform became aware of the importance of rich and standardised metadata, format consistency and format migration. Because clear workflows and basic archiving principles are already in place, moving to a professional archiving system with an automated preservation layer, or combining Mukurtu with one, will be much easier in the future.
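To make the idea of standardised metadata and format consistency more concrete, here is a minimal sketch in Python of the kind of check a team could run on records before upload. It is purely illustrative: the field names, the list of preferred formats and the `check_record` function are assumptions made for this example, not Mukurtu's actual schema or API.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical required fields and preferred formats, chosen for illustration
# only; they are NOT taken from Mukurtu or from the project described here.
REQUIRED_FIELDS = ("title", "language", "creator", "date", "access_level")
PREFERRED_FORMATS = {".wav", ".mp4", ".tif", ".pdf", ".txt"}


def check_record(record: dict) -> list[str]:
    """Return a list of human-readable problems found in one metadata record."""
    problems = []

    # Metadata consistency: every required field must be a non-empty string.
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {field}")

    # Format consistency: flag files whose extension is not on the preferred
    # list, so they can be queued for format migration.
    suffix = Path(record.get("filename", "")).suffix.lower()
    if suffix not in PREFERRED_FORMATS:
        problems.append(f"file format needs migration: {suffix or 'unknown'}")

    return problems
```

A record with all fields filled in and a file such as `song.wav` would return an empty list, while a record with missing fields or an `.mp3` file would return one message per problem. In practice the list of fields and formats would be agreed with the community and the receiving archive.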

Platforms such as Mukurtu offer an opportunity to break with the tradition of an extractivist North-South relationship, where data is kept securely in western academic institutions, while the rich materials compiled by language community members and activists in the Global South are not preserved and made accessible locally. Having a platform which can be easily localised, like Mukurtu, is already an essential step to making the materials discoverable by and accessible to their own authors and creators, strengthening the relationship between archives and their users.

In terms of community archiving, the ideal scenario would be the combination of the functionalities offered by Mukurtu with an automated preservation layer or with the additional storage of the materials in a professional archive that guarantees their preservation and accessibility over time. Efforts for making materials more easily accessible to communities and the general public are important, and can be very valuable for crowd-sourced collection of materials, but they need to be linked to or integrated in a professional archive to ensure that the data is preserved long-term.
