01. December | 2022

On the Way to the Participatory Archive: Digitalisation and Citizen Science in the IRS Scientific Collections

Citizen Science is finding its way into more and more areas of research, including archival work. In recent years, the IRS Scientific Collections have made great strides in providing the technical infrastructure for digitising their holdings. They are also leading the way in involving the public in indexing those holdings, developing solutions for many other specialised archives in the process.

Citizens identify insects, analyse baby sounds and document the orbits of stars. With their commitment, they are advancing the sciences a great deal. Citizen Science has produced a diverse repertoire of projects in recent years, and the natural sciences in particular have become pioneers of a promising social development. On Zooniverse, the portal for Citizen Science projects hosted by Oxford University, researchers from all over the world can upload their data and share it for processing.

But Citizen Science has also found its way into historical research. Small museums with a strong regional focus in particular invite their users into the archives to help with indexing. There one can sometimes observe citizens deciphering Sütterlin texts, cleaning insect boxes and holding almost faded glass slides up to the light to describe their contents. In doing so, they help solve one of the most common problems of small archives, which often hold great collections and sometimes even have quite good scanners and computers. What they lack is the staff to index the holdings, i.e. to describe the contents of their boxes and cabinets, cupboards and drawers and to provide them with computer-readable metadata. And so valuable treasures slumber in the archives, and it usually takes years before the public even learns that they exist.

The Scientific Collections of the IRS, with their unique holdings on the building and planning history of the GDR, are a prime example of these small archives: their collection, comprising the holdings of the former Institute for Urban Planning and Architecture (ISA) of the GDR Building Academy, the Association of Architects of the GDR and a large number of estates of leading figures in GDR architecture and planning, is unparalleled. Special funding from the Leibniz Association has made it possible since 2020 to digitise the holdings and present them online. Hardware and software for managing the holdings and describing their contents were purchased for this purpose, creating the basis for a modern digital infrastructure. However, personnel funds for the actual digitisation and indexing work flow only sporadically, often within the framework of third-party funded projects. Staff are often hired only for a limited period, and when they leave at the end of a project, a large part of the accumulated knowledge is lost.

In response to a call for proposals from the Federal Ministry for Economic Affairs and Climate Action (BMWK), which invited science and business to cooperate for the benefit of the common good, the idea arose to set up a third-party funded project together with the company Programmfabrik, manufacturer of Easydb, the digital asset management system that had been purchased. The goal: to bring together the know-how of IT experts, archivists and scientists in order to make citizen knowledge usable for the indexing of the collections. This is how the project “Development of a Citizen Science and Semantic Web-based Procedure for Digitising and Indexing the Holdings of Small Archives”, or “CitizenArchives” for short, came into being, led by Rita Gudermann and Paul Perschke.

The conditions were promising, as the Scientific Collections of the IRS can draw on a lively circle of interested people. For years, the research focus Contemporary History and Archives at the IRS has been organising so-called Workshop Talks, bringing together contemporary witnesses and scholars at lectures and over coffee. The bond between those bequeathing their papers and the researchers is close; the conversations are intense and sometimes conflict-laden. It often becomes clear that contemporary witnesses know the records of individual institutions better than those working in the archives, especially in the case of a collection that is still being built up. When it comes to evaluating the archival documents and placing them in the larger context, however, opinions can diverge.

So why not involve these very interested people in describing the collections? Perhaps they would even be willing to add to the collections by uploading their own photos and other materials? For this idea to work, however, the existing user interfaces had to be adapted, because citizens are not really enthusiastic about the deeply nested categories and complex metadata fields that professional cataloguing software offers and that delight the hearts of archive staff. The plan was therefore to give citizens specially secured access to the software in which the already digitised holdings are recorded, and to collect their entries in separate metadata fields. The images to be described now appear on an easier-to-use interface, and the number of fields for entering data has been greatly reduced. It will also be possible to upload new images.

Whether this idea worked was first tested among family and friends and then at the Lange Nacht der Wissenschaften (Long Night of the Sciences) in Berlin, using a simple user interface to the existing database infrastructure. Although the event was not quite as well attended as in previous years because of the pandemic, queues formed at the PC set up there. Visitors who grew up in the GDR in particular were delighted to rediscover the streets and buildings of their childhood and youth. Initial insights into the users and their preferences were also gained during the Lange Nacht: older people liked to use the offer in the company of their children. Browsing by district proved the best way into the material for most people. And data entry, it quickly became clear, worked best with free-text fields in which information of all kinds could be entered. These findings are now being refined with the help of further test interfaces and test users.

However, such an approach should not remain entirely unchecked. For who can guarantee that the Citizen Scientists do not make mistakes when describing the material, that their memory does not deceive them, or that they do not perhaps deliberately embellish or even misrepresent things? The project proposal already included a solution to this serious problem. There will be staggered editing rights for different users. The data entered will also be checked individually by an online editorial team. And finally, semantic web technologies will apply language analysis to draw conclusions about the correctness of entries and flag suspicious contributions for further review. This also brings artificial intelligence technology into citizen science. Before this technology can come into play, however, there must first be a sufficiently large number of real user entries.
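For readers who think in code, the three safeguards described above (staggered rights, editorial review, automatic flagging) might be modelled roughly as follows. This is only an illustrative sketch and not the project's or Easydb's actual implementation: the role names, the status values, the functions `submit` and `review`, and the `looks_suspicious` hook standing in for the planned language analysis are all assumptions made for this example, as is the sample entry.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    # Hypothetical rights levels; the article only says rights are "staggered".
    CITIZEN = 1    # may submit descriptions
    EDITOR = 2     # may additionally approve or reject submissions
    ARCHIVIST = 3  # highest level in this sketch


class Status(Enum):
    PENDING = "pending"    # waiting for the editorial team
    FLAGGED = "flagged"    # marked as suspicious, still needs human review
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Contribution:
    """A citizen entry, kept apart from the curated archival metadata."""
    image_id: str
    author: str
    text: str
    status: Status = Status.PENDING


def submit(queue, image_id, author, text, looks_suspicious=lambda t: False):
    """Add an entry to the review queue; flag it if the checker objects.

    `looks_suspicious` is a placeholder for the language analysis that the
    article says will be added once enough real entries exist.
    """
    entry = Contribution(image_id, author, text)
    if looks_suspicious(text):
        entry.status = Status.FLAGGED
    queue.append(entry)
    return entry


def review(entry, reviewer_role, accept):
    """Editorial decision; only editors and above may make it."""
    if reviewer_role.value < Role.EDITOR.value:
        raise PermissionError("only editors or archivists may review entries")
    entry.status = Status.APPROVED if accept else Status.REJECTED
    return entry


if __name__ == "__main__":
    queue = []
    # Made-up sample entry, for illustration only.
    entry = submit(queue, "img-1", "visitor", "Street scene, about 1960")
    review(entry, Role.EDITOR, accept=True)
    print(entry.status.value)
```

The point of the sketch is the separation of concerns: citizen input never overwrites curated metadata directly, a flag only routes an entry to closer human inspection, and the final decision stays with the editorial team.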

In the future, these new functions will be integrated into the Easydb software in the form of open-source plug-ins and will thus be available to other cultural institutions. In addition, the company Programmfabrik is preparing an open-source solution into which archives that do not use the Easydb software can also upload their images for description by citizens. An infrastructure is thus being developed that can be used in the long term not only for the “CitizenArchives” project, but also for the classic indexing of archive holdings by employees. Eventually, the “CitizenArchives” interface is also to be integrated into the portal of the IRS research focus with the scientific collections, which is currently being created and will enable online access to the extensive holdings for the first time.

The expected scientific and archival added value is great, and the time saved in indexing the archival records is only one factor. The transfer of knowledge that the citizens provide through their work should not be underestimated: entirely new bodies of knowledge, vocabularies and perspectives are opened up for the description of the holdings. Community building is also an important factor, because the joint work on making the archives available binds people together. To put the whole undertaking on a solid footing, scientific support for the project has been provided. The first promising results give hope that larger holdings of the scientific collections will soon be accessible to the public. The project thus aims not only at working with citizens but ultimately, above all, at working for them.


Project Management

Since January 2020, Rita Gudermann has been head of the institute-funded project to improve the digital infrastructure of the IRS's scientific collections. She has many years of professional experience as a research assistant at the Institutes of Economic History of the Free University of Berlin and the Humboldt University of Berlin, as an IT consultant for DAM and ERP systems, and in the development of a historical image database.


Research Associate

Paul Perschke has been employed at the IRS as a research assistant in the BMWK/IGP project CitizenArchives since December 2021. The project is to develop a knowledge and indexing platform, together with the associated processes, that draws on Citizen Science approaches and enables small archives such as the Scientific Collections of the IRS to involve interested users in the digitisation and indexing of their holdings. Paul studied political science and public law at the University of Trier (BA) as well as Historical Urban Studies at the Center for Metropolitan Studies at the Technische Universität Berlin.