What can you do when you have found a problem that AI could help solve, but are saddled with an inflexible legacy system? How do you handle problems that need solving now without investing too much time and effort in a solution that ever-evolving technology will make redundant within a few years? These questions are the starting point of our presentation - a case study from the National Library of Sweden (KB) on automating metadata enrichment for audio-visual (AV) materials in a legacy system, assisting staff in anticipation of future restructuring.
Currently, AV materials are catalogued using metadata purchased from a Swedish news agency, including program schedules, air times, and subject matter. However, for many programs - particularly live or news broadcasts - this metadata is incomplete or too generic to be useful. In these cases, staff manually improve searchability in the archive by listening to the broadcasts and writing their own summaries. This work is time-consuming: it can take experienced staff up to a full workday each week.
To address these issues, we have developed an automated solution that makes use of speech-to-text and large language models. Selected TV and radio programs are transcribed using KB-Whisper [1], a transcription model trained in-house. The transcribed text is then summarized using a language model (currently Llama 3.1-8B-Instruct [2]). These summaries are automatically added to the institution’s national AV catalogue, SMDB, in a dedicated field clearly labeled as AI-generated. Catalogue staff manage only the selection of programs to be processed - everything else happens without manual input. This means that catalogue staff can redirect their efforts from basic summarization toward tasks that require human judgment and domain expertise. Furthermore, the searchable AI-generated summaries enhance discovery and access for end users, improving the overall usability of the system without the need for significant development in the legacy platform.
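The flow described above (staff select programs, the audio is transcribed, the transcript is summarized, and the summary is stored in a dedicated field labeled as AI-generated) can be outlined in a minimal Python sketch. Everything named here is an assumption for illustration: `CatalogueRecord`, `enrich_records`, the field names, and the label text are hypothetical and do not reflect the actual SMDB schema or KB's implementation. The two stub functions merely stand in for the calls to KB-Whisper and the language model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set

AI_LABEL = "AI-generated"  # hypothetical label text, not SMDB's actual wording


@dataclass
class CatalogueRecord:
    """Hypothetical stand-in for a catalogue entry; not the real SMDB schema."""
    program_id: str
    purchased_metadata: str
    ai_summary: Optional[str] = None        # dedicated field for the summary
    ai_summary_label: Optional[str] = None  # marks the summary's origin


def enrich_records(
    records: Iterable[CatalogueRecord],
    selected_ids: Set[str],
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
) -> List[CatalogueRecord]:
    """Transcribe and summarize only the programs that staff selected."""
    enriched = []
    for record in records:
        if record.program_id not in selected_ids:
            continue  # unselected programs are left untouched
        transcript = transcribe(record.program_id)
        record.ai_summary = summarize(transcript)
        record.ai_summary_label = AI_LABEL
        enriched.append(record)
    return enriched


def stub_transcribe(program_id: str) -> str:
    # Placeholder for a speech-to-text step; returns fixed text.
    return f"Transcript of {program_id}. Second sentence with more detail."


def stub_summarize(transcript: str) -> str:
    # Placeholder for a language-model step; keeps only the first sentence.
    return transcript.split(". ")[0] + "."


records = [
    CatalogueRecord("news-001", "News"),
    CatalogueRecord("film-002", "Feature film"),
]
enriched = enrich_records(records, {"news-001"}, stub_transcribe, stub_summarize)
```

Passing the transcription and summarization steps in as callables keeps the sketch independent of any particular model, so either could be swapped out without changing the surrounding logic. The purchased metadata is never overwritten; the summary lives in its own labeled field.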
The process has been highly collaborative. Data scientists, developers, and cataloguing experts have worked closely together throughout. In addition to solving a technical problem, the project has fostered important conversations about the role of AI in cultural heritage institutions. Topics have included quality standards for AI-generated text, transparency toward users, and internal expectations for automated processes.
Although our solution is not a long-term replacement for modern infrastructure, it threads the needle between using the tools we have today and anticipating changing circumstances. The AI-enhanced cataloguing pipeline provides practical value now: it eases workloads, improves metadata quality, and builds organizational experience with AI technologies.
In conclusion, the project shows how AI can be used to improve public sector operations even within the limitations of outdated systems. The focus on incremental improvement, cross-functional collaboration, and transparency has helped turn a short-term constraint into an opportunity for innovation. As institutions across the cultural heritage sector face similar challenges - aging systems, rising content volumes, and increasing user expectations - this type of applied AI work offers a concrete and transferable model. It demonstrates that meaningful change is possible without waiting for perfect conditions, and that small steps can lead to significant long-term impact.
Background:
Since 2019, KB has operated KB Lab, a data lab focused on training AI models for the Swedish language and supporting researchers with access to structured collection data.
The KBx initiative was launched by KB to begin implementing AI within the organization through small-scale solutions and an experimental approach. The core team is small, and additional expertise - such as subject matter experts, system developers, and others - is brought in as needed, depending on the nature of the ongoing work.
By developing tangible products, even on a small scale, we hope to make the advantages and possibilities of AI more accessible and easier to understand for our staff. Through this approach, we aim to foster a sense of ambassadorship for AI among staff.
References:
[1] Leonora Vesterbacka, Faton Rekathati, Robin Kurtz, Justyna Sikora, Agnes Toftgård. Swedish Whispers; Leveraging a Massive Speech Corpus for Swedish Speech Recognition. https://arxiv.org/abs/2505.17538 (2025).
[2] Aaron Grattafiori et al. The Llama 3 Herd of Models. https://arxiv.org/abs/2407.21783 (2024).