As libraries navigate the integration of generative AI into their services and operations, one of the most promising applications is data-driven evaluation and insight. This workshop offers an accessible, hands-on introduction to prompt engineering with generative AI models to produce Python code for analyzing library data. Designed for librarians, archivists, digital humanities professionals, and other practitioners who regularly collect data about their patrons, collections, needs, and questions, the workshop emphasizes practical skills that support reflective, responsible, and user-centered library analytics.
While many professionals in galleries, libraries, archives, and museums (GLAM) are already experimenting with ChatGPT or similar tools, fewer have had the opportunity to work with generative models as coding assistants. This workshop bridges that gap by demonstrating how AI can support Python-based text analysis, without requiring any deep programming experience. Using the freely accessible Google Colab platform, participants will learn how to generate, evaluate, and run code that performs tasks such as data cleaning, keyword extraction, time-based analysis, and basic natural language processing (e.g., sentiment analysis or named entity recognition).
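To illustrate the kind of code a well-crafted prompt can elicit, here is a minimal keyword-extraction sketch of the sort a model might generate. The sample transcripts and the short stopword list are illustrative stand-ins, not part of the workshop dataset:

```python
from collections import Counter
import re

# Illustrative chat excerpts (not from the workshop dataset)
transcripts = [
    "Hi, how do I renew a book online?",
    "Can you help me find peer-reviewed articles on renewable energy?",
    "How do I renew my interlibrary loan book?",
]

# A tiny stopword list for demonstration; real analyses use a fuller one
STOPWORDS = {"hi", "how", "do", "i", "a", "can", "you", "me", "my", "on"}

def top_keywords(texts, n=3):
    """Lowercase, tokenize, drop stopwords, and count word frequency."""
    words = []
    for text in texts:
        words.extend(re.findall(r"[a-z']+", text.lower()))
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)

print(top_keywords(transcripts))  # "renew" and "book" each appear twice
```

In the workshop, participants would arrive at code like this by iterating on a prompt (e.g., asking the model to add stopword filtering after seeing "how" and "do" dominate the first output).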
The workshop will use an anonymized dataset of chat transcripts between university-affiliated patrons and librarians. Participants will explore how reference interactions can illuminate patterns in information-seeking behavior, highlight service gaps, and inform the design of responsive, evidence-based support models. More broadly, participants will gain skills for evaluating other datasets common to GLAM settings, such as visitor interactions, inventories and catalogs, authority files, and patron questions, and for using those analyses and visualizations to better tell the stories of their institutions.
The workshop will begin with a conceptual and ethical framing of the work: what does it mean to use AI to study patron conversations? How do we protect privacy, ensure interpretive validity, and maintain human-centered values? Next, participants will explore prompt engineering strategies for eliciting quality Python code. Through collaborative experiments and structured exercises, attendees will refine prompts to generate reproducible code that is both transparent and adaptable.
The hands-on portion of the session will guide participants through working with a pre-prepared, de-identified chat transcript dataset. Attendees will use generative AI to build code blocks that:
- Clean and format textual data
- Identify recurring topics, questions, and keywords
- Analyze time-based usage trends
- Perform introductory sentiment analysis
- Visualize data to support advocacy and storytelling
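The cleaning and time-trend steps above can be sketched in a few lines of pandas. The column names and sample rows here are hypothetical stand-ins for the de-identified dataset:

```python
import pandas as pd

# Hypothetical de-identified chat log; column names are assumptions
raw = pd.DataFrame({
    "timestamp": ["2024-02-05 09:15", "2024-02-05 14:02",
                  "2024-02-06 14:47", "  2024-02-06 21:30  "],
    "question": ["How do I renew a book?", "  Database access?  ",
                 "Citation help please", None],
})

# Clean: strip stray whitespace, drop rows with no question text
chats = raw.assign(question=raw["question"].str.strip()).dropna(subset=["question"])

# Parse timestamps so pandas can work with them as dates
chats["timestamp"] = pd.to_datetime(chats["timestamp"].str.strip())

# Time-based trend: how many chats arrive in each hour of the day
by_hour = chats["timestamp"].dt.hour.value_counts().sort_index()
print(by_hour)
```

A prompt as simple as "write pandas code that drops empty questions, parses the timestamp column, and counts chats per hour" can produce a working first draft of this block, which participants then inspect and refine.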
The workshop concludes with a guided discussion on interpreting results and applying insights in context, for example, to enhance instructional outreach, optimize staffing models, or design more targeted digital services. Participants will be encouraged to share ideas for adapting the methods to their local environments, including other types of communication logs or reference data.
All materials, including a Google Colab notebook template, sample prompts, an ethics guide, and an optional reading list, will be shared for reuse and adaptation. No prior coding experience is required; the workshop is intended for professionals with curiosity and an interest in bridging the gap between AI and practical library evaluation.
Proposed Workshop Outline
Part 1: Foundations (20–30 min)
- Welcome and goals
- Brief overview of chat reference as a data source
- Ethical use: privacy, de-identification, institutional review
- Prompt engineering basics: effective vs. ineffective prompts
Part 2: Prompting the Model (30 min)
- Live demo: using generative AI to write a Python function to:
  - Load and clean chat transcript data
  - Identify user types or question types
- Participants generate prompts based on sample questions (possibly in breakout groups)
- Compare generated code and refine
Part 3: Analysis in Action (30 min)
- Participants copy a Google Colab notebook with structured input/output
- Guided exploration of:
  - Word frequency and n-gram analysis
  - Time series: busiest times/days for engineering questions
  - Sentiment or emotional tone detection (optional)
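As a concrete sketch of the word-frequency and n-gram exploration, using only the standard library and toy questions (not the workshop data):

```python
from collections import Counter
import re

# Toy questions standing in for the de-identified dataset
questions = [
    "where can I find course reserves",
    "how do I access course reserves online",
    "how do I book a study room",
]

def ngram_counts(texts, n=2):
    """Count n-grams (adjacent word sequences) across all texts."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"\w+", text.lower())
        counts.update(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return counts

# Bigrams such as ("course", "reserves") surface recurring topics
print(ngram_counts(questions).most_common(3))
```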
Part 4: Interpretation & Next Steps (20–30 min)
- Visualizing and interpreting the results
- Discussing findings: how could this impact:
  - Instruction
  - Staffing
  - Resource design
- Group reflection: how might participants use this in their own institutions?
- Wrap-up and resource list
Proposed Outcomes
Participants will leave with:
- A working understanding of how to use prompt engineering to write Python code with generative AI
- Hands-on experience in analyzing anonymized library chat transcripts in a cloud-based environment
- Strategies for applying this method to their own datasets or service questions
- Templates and resources for reuse in training, internal assessment, or collaborative projects
Resources Required
- Google Colab (browser-based; a free Google account is needed to run notebooks)
- Internet access
- Sample datasets and handouts provided by instructor
Additional Information
Participants are welcome to use their own datasets for experimentation (e.g., anonymized reference transcripts), though this is not required and instruction will be designed for the sample dataset provided. A follow-up informal session or online space will be available for those interested in deeper exploration or collaboration.