Conference Agenda

Session

Workshop 2

Time: Monday, 31/Mar/2025, 10:00am - 1:00pm

Location: Konferenzraum III

Presentations

Structured information extraction with LLMs

Paul Ferdinand Simmering

Q Agentur für Forschung GmbH, Germany

Duration of the workshop

2.5 hours

Target groups

Analysts and researchers working with text data, e.g. transcripts, news articles, social media posts or reviews

Is the workshop geared at an exclusively German or an international audience?

International

Workshop language

English

Description of the content of the workshop

This workshop introduces the application of Large Language Models (LLMs) to structured information extraction in market research and the social sciences. Participants will implement solutions to natural language processing tasks such as text classification, entity recognition, and sentiment analysis. The session includes hands-on exercises in Python using the "instructor" library. Participants will learn about strategies for prompting, few-shot examples and fine-tuning. The approaches taught are compatible with a wide range of open-source and commercial models. Discussion sections of the workshop will cover the methodological and technical possibilities and limitations of LLMs for information extraction.
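To illustrate what "structured information extraction" means in practice, here is a minimal sketch that is not taken from the workshop materials. It uses only the Python standard library and a hard-coded string in place of a real model reply. The schema, the field names (`sentiment`, `products`) and the helper `parse_extraction` are hypothetical. In the workshop, the "instructor" library would presumably play this role against a live model; this sketch only shows the underlying idea of checking a reply against a fixed schema.

```python
import json
from dataclasses import dataclass
from typing import List

# Assumed label set for a sentiment task (illustrative, not from the workshop).
ALLOWED_SENTIMENTS = {"positive", "neutral", "negative"}


@dataclass
class ReviewExtraction:
    """Hypothetical target schema for one product review."""
    sentiment: str
    products: List[str]


def parse_extraction(raw_reply: str) -> ReviewExtraction:
    """Validate a JSON reply against the schema; raise ValueError if it does not fit."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")

    sentiment = data.get("sentiment")
    if not isinstance(sentiment, str) or sentiment not in ALLOWED_SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {sentiment!r}")

    products = data.get("products")
    if not isinstance(products, list) or not all(isinstance(p, str) for p in products):
        raise ValueError("products must be a list of strings")

    return ReviewExtraction(sentiment=sentiment, products=products)


# Hard-coded stand-in for what an LLM might return for a review such as
# "The headphones broke after a week."
fake_reply = '{"sentiment": "negative", "products": ["headphones"]}'
result = parse_extraction(fake_reply)
print(result)
```

Rejecting replies that do not match the schema, instead of silently accepting them, is what makes the extracted data usable in downstream analysis. A failed validation can also be used as a signal to retry the request.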

Goals of the workshop

Get hands-on experience with structured information extraction
Get an overview of available models, tools and prompting tactics
Learn about evaluation, efficiency and limitations
Share experiences and use cases

Necessary prior knowledge of participants

Basic knowledge of Python. R users can use the primer listed under recommended additional literature to get up to speed quickly. The code examples in the workshop can be followed with minimal coding knowledge; extending them requires a bit more.

Literature that participants need to read prior to participation

A starter guide will be sent before the workshop. It will contain instructions for using Google Colab and installing the required Python packages.

Recommended additional literature

Primer on Python for R users: https://rstudio.github.io/reticulate/articles/python_primer.html

Information about the instructor

Paul Simmering is a data scientist at Q Agentur für Forschung, where he works on social media and review analysis. He has presented research on sentiment analysis at GOR 23 and GOR 24.

Maximum number of participants

Will participants need to bring their own devices in order to be able to access the Internet? Will they need to bring anything else to the workshop?

Participants will need to bring a laptop. An OpenAI API key will be provided for use during the workshop. The recommended development environment for beginners is Google Colab, which is free and runs in the browser. A starter guide will be provided. Advanced users are welcome to use an IDE of their choice and an LLM platform other than OpenAI, provided it is compatible with instructor, such as Anthropic, Cohere, Gemini or local models running on Ollama.

GOR 25
General Online Research Conference 2025

Freie Universität Berlin - Henry Ford Building

31 March - 2 April 2025
