Level of experience for attendees with relevant technologies: None required/beginner
The rapid proliferation of AI tools embodies the conference theme "AI Everywhere, All at Once" – from embedded features in library management systems to standalone transcription services, AI has suddenly appeared across GLAM workflows. However, this ubiquity often comes with hidden costs: vendor lock-in, opaque algorithms, unpredictable pricing, and limited control over how AI processes cultural heritage materials. Many GLAM professionals find themselves overwhelmed by the pace of change, unsure whether to embrace, resist, or fear these new technologies.
Open source AI offers a strategic alternative that aligns with core GLAM values: transparency, community collaboration, sustainability, and democratic access to knowledge. This hands-on workshop introduces open source AI not just as a tool to use, but as an ecosystem to contribute to – transforming GLAMs from passive consumers to active collaborators in building AI solutions by and for the cultural heritage sector.
Why Open Source AI for GLAMs?
Unlike proprietary solutions, open source AI provides transparency into algorithmic processes, ensuring ethical alignment with institutional missions. It offers long-term sustainability by reducing dependency on commercial vendors whose priorities may not align with cultural heritage goals. Most importantly, it enables collective problem-solving: when multiple institutions face similar challenges – from handwritten text recognition to multilingual metadata enhancement – open source collaboration prevents duplicated effort and builds shared expertise.
The current AI landscape forces GLAMs into a difficult choice between buying expensive proprietary solutions and building all expertise in-house. Open source AI provides a third path: collaborative development where institutions can contribute according to their capabilities while benefiting from collective knowledge.
Workshop Structure and Learning Through Practice
Participants will engage with three core components through hands-on exercises using browser-based tools. While no software installation is required, participants can optionally install local AI tools like LM Studio for extended experimentation. The workshop uses the Hugging Face Hub as our primary platform, supplemented by the transformers library for model deployment and Whisper for audio processing.
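For a sense of what this tooling looks like in practice, the sketch below shows Whisper transcription via the transformers library; the checkpoint and file name are illustrative only, and nothing like this is required to follow the browser-based exercises.

```python
# Minimal sketch (not part of the required workshop setup) of Whisper-based
# transcription with the transformers library. The audio file name is hypothetical.
from transformers import pipeline

# Load an open Whisper checkpoint from the Hugging Face Hub
transcriber = pipeline("automatic-speech-recognition", model="openai/whisper-small")

# Transcribe a local audio file (decoding requires ffmpeg on the machine)
result = transcriber("oral_history_interview.wav")
print(result["text"])
```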
1. Understanding the Open Source AI Ecosystem
We'll explore what "open source AI" means across the complete stack: open models (weights and architectures), open datasets, open tools, and open development practices. Participants will learn to navigate the Hugging Face Hub to find, evaluate, and deploy models suitable for GLAM contexts. We'll demonstrate practical workflows for model discovery, testing, and integration into existing systems.
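As one illustration of that model-discovery workflow, the hedged sketch below uses the huggingface_hub client to list popular models for a single task; the task filter shown is only an example, and the workshop itself relies mainly on the Hub's web interface.

```python
# Illustrative sketch of programmatic model discovery on the Hugging Face Hub.
from huggingface_hub import HfApi

api = HfApi()

# List the most-downloaded open models for one GLAM-relevant task,
# e.g. captioning digitized photographs
models = api.list_models(
    task="image-to-text",
    sort="downloads",
    direction=-1,
    limit=5,
)
for m in models:
    print(m.id, m.downloads)
```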
2. Case Study: Building Custom Text Classifiers
Drawing from recent advances in model distillation, we'll demonstrate how GLAMs can develop custom text classification solutions for their specific needs. Whether categorizing archival descriptions, identifying document types, or analyzing user queries, text classification is a foundational AI task with broad GLAM applications.
Participants will work through the complete process: using large language models to generate training labels for smaller, efficient models like ModernBERT. This approach addresses a common GLAM challenge – having domain-specific classification needs but lacking labeled training data. By leveraging large models' reasoning abilities to create synthetic labels, institutions can build highly targeted classifiers that run efficiently on modest hardware while maintaining full transparency and control.
The hands-on exercise will demonstrate this workflow using sample GLAM data, showing how to prompt large models for consistent labeling, evaluate label quality, and train efficient classifiers suitable for production use.
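To make the second step concrete, the sketch below outlines fine-tuning ModernBERT on synthetically labeled records with the transformers Trainer; the label set, sample records, and training settings are illustrative rather than the workshop's actual materials.

```python
# Hedged sketch of step two of the distillation workflow described above:
# fine-tuning a small encoder on labels produced in step one by a large model.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

# Example label set and records; in practice the integer labels would come
# from prompting a large open model over unlabeled institutional records.
labels = ["correspondence", "photograph", "map"]
records = [
    {"text": "Letter from the donor regarding the 1923 acquisition", "label": 0},
    {"text": "Glass plate negative of the harbor, c. 1900", "label": 1},
    # ...hundreds more synthetically labeled records...
]

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "answerdotai/ModernBERT-base", num_labels=len(labels)
)

# Tokenize the text field; padding is handled per batch by the data collator
dataset = Dataset.from_list(records).map(
    lambda r: tokenizer(r["text"], truncation=True)
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="glam-classifier", num_train_epochs=3),
    train_dataset=dataset,
    data_collator=DataCollatorWithPadding(tokenizer),
)
trainer.train()
```

The resulting classifier is small enough to run on modest institutional hardware, and both the labels and the training data remain fully inspectable.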
3. Case Study: Rapid Prototyping with Local Tools
Using LM Studio as the primary tool for this case study, we'll explore how GLAMs can develop AI proofs of concept with minimal resources and infrastructure. LM Studio enables running powerful open source models locally, providing an ideal environment for experimentation without cloud dependencies or ongoing costs.
Participants will learn to use LM Studio for rapid prototyping of GLAM applications: testing different models for transcription tasks, experimenting with prompt engineering for metadata enhancement, and evaluating model performance on institution-specific data. This approach demonstrates how GLAMs can move from "AI curious" to "AI capable" using accessible tools and modest hardware investments.
We'll cover practical considerations including model selection for different hardware constraints, prompt optimization techniques, and strategies for moving from prototype to production systems.
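For participants who do install LM Studio, the sketch below shows the kind of scripted prototyping it enables through its local, OpenAI-compatible server (by default at http://localhost:1234/v1); the loaded model and prompt are illustrative, and the same experimentation can be done entirely through LM Studio's chat interface.

```python
# Hedged sketch of prototyping against LM Studio's local OpenAI-compatible server.
# The model identifier and prompt are examples only.
from openai import OpenAI

# LM Studio's local server accepts any API key string
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="local-model",  # whichever open model is currently loaded in LM Studio
    messages=[
        {"role": "system", "content": "You enrich library metadata."},
        {"role": "user", "content": "Suggest three subject headings for: "
                                    "'Diary of a whaling voyage, 1851-1853'."},
    ],
    temperature=0.2,
)
print(response.choices[0].message.content)
```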
Building Collaborative GLAM AI Infrastructure
Moving beyond individual implementations, we'll explore three key areas where GLAMs can collaborate to build sustainable AI capacity:
Shared Benchmarking Approaches: The cultural heritage sector lacks standardized methods for evaluating AI model performance on GLAM-specific tasks. We'll discuss frameworks for developing community benchmarks that allow institutions to compare both open source and proprietary solutions fairly. This includes creating evaluation datasets that reflect real GLAM challenges, establishing metrics that align with cultural heritage values, and building systems for sharing evaluation results across the community.
Collaborative Dataset Development: One of the most powerful aspects of open source AI is the ability to pool resources for dataset creation. We'll explore strategies for collaborative dataset development while respecting copyright, cultural sensitivities, and institutional policies. This includes approaches for distributed annotation, shared data standards, and mechanisms for ensuring contributed datasets serve broad community needs rather than narrow institutional interests.
Open Tool and Workflow Sharing: Rather than each institution developing similar solutions independently, we'll examine how GLAMs can create and maintain shared tools and workflows. This includes developing modular, adaptable solutions that can be customized for different institutional contexts, creating documentation standards that enable effective knowledge transfer, and establishing maintenance models that ensure long-term sustainability of community resources.
Critical Technology Assessment
The workshop addresses when open source AI may not be optimal – examining computational requirements, specialized domain needs, and institutional capacity limitations. Equally important is developing frameworks for evaluating the growing presence of embedded AI in vendor tools.
Many GLAM systems now include AI features that institutions cannot inspect, control, or fully understand. We'll discuss strategies for:
- Assessing vendor AI implementations and their implications for collections and users
- Understanding data usage policies and algorithmic accountability in proprietary systems
- Developing institutional policies for managing AI tools you don't control
This critical assessment helps participants make informed decisions about when to embrace open source alternatives versus when to accept vendor solutions, and how to maintain agency regardless of the choice.
Target Audience and Pedagogical Approach
Designed for GLAM professionals with varied technical backgrounds, the workshop balances practical implementation with critical reflection. No prior AI experience is required, but participants should be comfortable with basic computer tasks and curious about technical possibilities. The semi-technical approach ensures participants understand enough about the technology to make informed institutional decisions without requiring deep programming knowledge.
The workshop emphasizes learning through doing rather than passive consumption of information. Each concept is introduced through practical exercises that participants can immediately apply to their own contexts.
Workshop Timetable (2 hours)
- Introduction and Open Source AI Ecosystem Overview (20 mins)
- Hands-on: Navigating Hugging Face Hub and Model Discovery (20 mins)
- Case Study 1: Building Custom Text Classifiers with ModernBERT (20 mins)
- Break (10 mins)
- Case Study 2: Rapid Prototyping with Local Tools (30 mins)
- Collaborative GLAM AI Infrastructure & Critical Assessment (15 mins)
- Wrap-up and Next Steps (5 mins)
What You'll Need
Required:
- Laptop with a modern web browser (Chrome or Firefox recommended) and internet access
- A free Hugging Face account, created before the workshop at huggingface.co
Optional:
- LM Studio installed for local experimentation (8GB RAM minimum, 16GB+ recommended)
- Sample datasets from your institution for contextualized examples
Note: If you cannot install software on your laptop, you can still participate fully; all core exercises run in the browser.
What to Prepare
- Consider specific AI use cases or challenges at your institution
- Think about potential collaborative opportunities with other GLAMs
- No prior AI or programming experience required
- Optional: Review pre-reading materials that will be sent before the workshop
Workshop Outcomes
Participants will gain practical skills in using open source AI tools, understand collaborative development models for sustainable AI in GLAMs, develop critical evaluation frameworks for AI implementations, and build networks for ongoing collaboration in the open source AI ecosystem.