Conference Agenda

Overview and details of the sessions of this conference. Please select a date or location to show only sessions held on that date or at that location. Please select a single session for detailed view (with abstracts and downloads, where available).

Please note that all times shown refer to the time zone of the conference.

Session Overview
Session: PSG 1 - e-Government_B
Time: Thursday, 28.08.2025, 8:30 - 10:30

Session Chair: Dr Shirley KEMPENEER, Tilburg University

"Large language models and chatbots in the public sector"

 


Presentations

Chatbot Utilization in the Public Sector: A Systematic Literature Review of types, conditions, effects and users

J. Ignacio CRIADO, Carlos Jiménez Cid

Universidad Autónoma de Madrid, Spain

Chatbots have become the primary artificial intelligence (AI) interface through which citizens interact with public administrations around the world. Since the pioneering research of Androutsopoulou et al. (2018), the literature on this subject has developed as a multidisciplinary field. While some studies approach it from a technological design perspective (Androutsopoulou et al., 2018), others have focused their attention on the interaction experience between chatbots and user characteristics (Følstad and Bjerkreim-Hanssen, 2024). For its part, in the field of public administration, relevant studies report factors of chatbot adoption and implementation by public agencies (Chen et al., 2024; Maragno et al., 2023), public value positions (Hemesath and Tepe, 2024), and analyses of public servants and chatbot characteristics (Li and Wang, 2024).

Nevertheless, there is a gap in the literature regarding the connection between these trends, which hinders a comprehensive understanding of the underlying relationships between chatbot implementation dimensions within the public sector context. In this study, we propose that different types of chatbot utilization may lead to specific effects, encompassing tasks from providing public information to enabling bureaucratic transactions with public agencies (Makasi et al., 2020). These benefits of chatbot utilization (as well as potential challenges) may vary depending on the stakeholders involved and their expectations (Li and Wang, 2024). While citizens may perceive certain benefits (e.g., time savings in public services), public administrations may recognize different kinds of advantages (e.g., enhanced citizen trust, public service efficiency) (Hemesath and Tepe, 2024). In this regard, it is important to highlight that chatbot utilization is influenced by the specific conditions of each stakeholder involved. For example, factors such as the sensitivity of the public policy area (Aoki, 2020), the design technologies involved (Makasi et al., 2021), or the overall interaction experience with the chatbot (Følstad and Bjerkreim-Hanssen, 2024) can significantly affect the realization of such utilization benefits.

To address this research gap, we have adapted the analytical framework of Safarov et al. (2017) on open data utilization, and propose the following research questions: RQ1. What are the primary purposes and categories of chatbot utilization in the context of the public sector? RQ2. What are the potential effects of chatbot implementation in the context of the public sector? RQ3. What conditions regulate the impact of chatbot utilization in the context of the public sector? RQ4. Who are the primary users of chatbots, and what are their key characteristics or roles in a public sector context? To this end, we have conducted a systematic literature review (SLR) on chatbot utilization in the public sector. The main contribution of this study is the proposal of a chatbot utilization analytical framework that enhances the understanding of how different dimensions interact to maximize expected benefits and mitigate potential challenges in public service provision.



Evaluating Public Servants’ Perceptions and Regulatory Implications of LLM-Based Chatbots in Street-Level Organizations

Raimund LEHLE

University of Applied Sciences – Public Administration and Finance, Ludwigsburg, Germany

The integration of generative artificial intelligence (genAI) into public administration is a contemporary focus, driven by the rapid advancements in AI technology and the increasing demand for more efficient and responsive public services. This study examines the application and perceptions of large language model (LLM)-based chatbots among public servants in an exemplary street-level organization, with its dual obligation that includes protecting citizens from potential algorithmic harm (Kuziemski and Misuraca, 2020). We conducted an experimental study to explore the use cases, as well as the perceived opportunities and risks surrounding this technology, from the perspective of these key public administrators, and supplement this study with expert interviews on the gathered responses.

This study employs categories derived from public value theory (Andersen et al., 2012) to explore the sentiments expressed regarding AI-enhanced service provision. This theoretical framework is complemented by principles of responsible AI (Papagiannidis et al., 2025), which add explicit ethical implications of AI integration in public services. The present study involved a sample of public servants from a municipal-level organization. To gather in-depth insights into the regulatory and ethical dimensions of AI deployment, expert interviews were conducted with legal scholars and municipal AI strategists. Data were collected through a preliminary survey, an experiment, and semi-structured interviews, and were subsequently analyzed using a combination of qualitative and quantitative methodologies.

Preliminary findings and the literature indicate a mixed response; while many public servants recognize the potential efficiency gains of LLM-based chatbots, aligning with Cantens (2024), concerns persist regarding data privacy, decision-making transparency, and potential biases inherent in AI systems. We identify regulatory gaps and inconsistencies concerning AI deployment in public sector contexts, highlighting the need for coherent guidelines to govern the integration and operation of AI tools in public services and to enable case workers to use these tools proficiently.

This work contributes to the discourse on the practical implications of AI in public administration, providing practical insights into the use cases and implications for public value generation. Preliminary findings suggest that while AI has the potential to significantly enhance service delivery, it is crucial to address the identified risks through coherent regulatory frameworks and targeted training programs. We recommend specific policy adjustments, such as the establishment of AI ethics boards and the implementation of regular audits to ensure compliance with ethical standards (Desouza et al., 2020), as well as coherent digital procedural legislation. Additionally, training initiatives for public servants should focus on enhancing their understanding of AI technologies and their ethical implications, thereby fostering a more informed and responsible approach to AI integration (Ahn and Chen, 2022) and ensuring sensible AI deployment in public services.



Open to open-source AI: Navigating AI model choice in the public sector

Nicholas ROBINSON

Hertie School of Governance, Germany

The public sector is increasingly adopting artificial intelligence (AI) tools. High-quality open-source AI (OSAI) options are available, but much of the current attention in government is on proprietary options such as Copilot and ChatGPT. There are parallels with the discourse around open-source software (OSS) versus proprietary software: although OSS is used for certain functions in Agencies, it has not seen widespread adoption despite backing from technical and political spheres. Proponents of open source argue this has potentially increased costs while limiting competition and broad-based innovation.

Drawing on the frameworks and evidence used to study OSS uptake, I draw on interviews with 31 decision-makers on AI adoption in Australian, Canadian and German Agencies to analyse key factors in the feasibility of open-source technologies in general and OSAI in particular, compared to their proprietary counterparts. I find organisational factors are highly influential on the acceptance of open-source, while technological attributes like usability and environmental factors like the availability of support are also important.

While these factors are also relevant for OSAI, technological characteristics like fit, control, and the availability of hardware infrastructure appear more critical for the decision-making process. As AI models are relatively easy to benchmark and switch between, fears of lock-in were not as strong an influence. Furthermore, organisational considerations like digital sovereignty and data protection, which are not prominent in OSS debates, appear more relevant in current deliberations. Agencies were also seeking central government guidance on AI model choice, but open-source communities were less relevant, although this may change as the sector evolves. Although AI is a fast-evolving technology, choosing between proprietary AI and OSAI requires significant decisions to be made today, such as investment in hardware and how to ensure sovereignty, that will echo into the future.



Ground-truth is law: A Systematic Review of Evaluation Methods for Legal Case Retrieval Systems

Julian Michael Quintijn LEEFFERS

Tilburg University, The Netherlands

The digitalisation of public sector information is making large volumes of legal decisions publicly available, creating opportunities for Legal Case Retrieval (LCR) systems to enhance transparency and consistency in judicial and administrative decision-making. Yet assessing whether these systems work effectively depends on well-constructed ground-truth datasets: labelled collections of legal documents indicating which cases are considered relevant for a given query or reference case. Current practices vary widely and often fail to reflect the nuanced legal information needs of practitioners. This study systematically reviews 28 academic works covering 31 datasets, examining evaluation frameworks, labelling methods, and the relevance dimensions they embody. Findings reveal a dominance of topical and algorithmic relevance, with situational, cognitive, and domain-specific aspects underrepresented. The paper calls for transparent, multidimensional, and legally grounded evaluation practices to ensure LCR systems align with the broader goals of public administration and legal information-seeking behaviour. Recommendations include leveraging large language models for explainable annotations and incorporating diverse relevance dimensions.
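To illustrate the role such ground-truth datasets play, the sketch below shows how a retrieval system's ranked output is scored against binary relevance labels using standard precision@k and recall@k metrics. This is a minimal, hypothetical example (the case identifiers and labels are invented, and real LCR evaluations use richer, graded relevance judgements), not a description of any dataset reviewed in the paper.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved cases that are labelled relevant."""
    top_k = retrieved[:k]
    return sum(1 for case in top_k if case in relevant) / k

def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant cases that appear in the top-k results."""
    top_k = retrieved[:k]
    return sum(1 for case in top_k if case in relevant) / len(relevant)

# Hypothetical ground truth for one query case: the set of cases that
# annotators judged relevant (binary topical relevance, the dominant
# labelling practice noted above).
ground_truth = {"ECLI:NL:HR:2020:123", "ECLI:NL:HR:2019:456"}

# Hypothetical ranked output of an LCR system for the same query.
system_ranking = [
    "ECLI:NL:HR:2020:123",
    "ECLI:NL:RBAMS:2021:789",
    "ECLI:NL:HR:2019:456",
]

print(precision_at_k(system_ranking, ground_truth, k=3))  # 2 of 3 relevant
print(recall_at_k(system_ranking, ground_truth, k=3))     # both relevant cases found
```

Metrics like these capture only topical relevance; the situational and cognitive dimensions the review highlights would require different, practitioner-grounded labelling schemes.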