Conference Agenda (All times are shown in Mountain Daylight Time (MDT) unless otherwise noted)
Overview and details of the sessions of this conference.
Session Chair: Junhua Ding, University of North Texas, USA
Location: Imperial Ballroom 2, Third Floor
Presentations
10:30am - 11:00am
ID: 458 / PS-02: 1
Long Papers
Topics: Privacy; Ethics; and Regulation (information ethics; AI ethics; open access; information security; information privacy; information policy; legislation and regulation; international information issues)
Keywords: AI agents; multiagent debate; large language models (LLMs); human-in-the-loop; value sensitive design
"Tipping the Balance": Human Intervention in Large Language Model Multi-Agent Debate
Haley Triem, Ying Ding
The University of Texas at Austin, USA
Methods for eliciting reasoning from large language models (LLMs) are shifting from filtering natural language “prompts” through contextualized “personas” towards structuring conversations between LLM instances, or “agents.” This work expands upon LLM multiagent debate by inserting human opinion into the loop of generated conversation. To simulate complex reasoning, LLM instances were given United States district court decisions and asked to debate whether to “affirm” or “not affirm” the decision. Agents were examined in three phases: “synthetic debate,” where one LLM instance simulated a three-agent discussion; “multiagent debate,” where three LLM instances discussed among themselves; and “human-AI debate,” where multiagent debate was interrupted by human opinion. During each phase, a nine-step debate was simulated one hundred times, yielding 2,700 total debate steps. Conversations generated by synthetic debate followed a pre-set cadence, proving them ineffective at simulating individual agents and confirming that mechanism engineering is critical for multiagent debate. Furthermore, the reasoning process behind multiagent decision-making was strikingly similar to human decision-making. Finally, while LLMs do weigh human input more heavily than AI opinion, they do so only by a small margin. Ultimately, this work asserts that a careful human-in-the-loop framework is critical for designing value-aware, agentic AI.
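The abstract describes the debate mechanism only in prose; as a rough illustration, the sketch below shows one way a three-agent debate with a mid-debate human interruption could be wired up. It assumes the OpenAI chat completions API, and the helper names (query_llm, run_debate), the model choice, and the prompt wording are hypothetical, not the authors' implementation.

# Minimal sketch only; the client call is the real OpenAI API, but the structure,
# model name, and prompts are assumptions, not the authors' code.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def query_llm(system_prompt, transcript):
    """Ask one agent for its next debate turn, given the transcript so far."""
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder model choice
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message.content

def run_debate(case_summary, human_opinion=None, steps=9, human_step=5):
    """Three agents debate whether to 'affirm' or 'not affirm' a court decision.
    If human_opinion is provided, it is injected into the transcript at human_step,
    loosely mirroring the 'human-AI debate' condition described above."""
    agent_prompts = [
        f"You are debater {i + 1} of 3. Argue whether the court decision should be "
        "'affirmed' or 'not affirmed', responding to the other debaters' points."
        for i in range(3)
    ]
    transcript = f"Court decision summary:\n{case_summary}\n"
    for step in range(steps):
        if human_opinion is not None and step == human_step:
            transcript += f"\nHuman participant: {human_opinion}\n"
        idx = step % 3
        turn = query_llm(agent_prompts[idx], transcript)
        transcript += f"\nDebater {idx + 1}: {turn}\n"
    return transcript

Running a loop of this kind one hundred times per condition would roughly mirror the scale reported in the abstract (9 steps x 100 runs x 3 phases = 2,700 debate steps).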
11:00am - 11:30am
ID: 443 / PS-02: 2
Long Papers
Topics: Human-Computer Interaction (usability and user experience; human-technology interaction; human-AI interaction; user-centered design)
Keywords: ChatGPT, Search Strategy, Systematic Review, Performance Evaluation
An Empirical Study Evaluating ChatGPT’s Performance in Generating Search Strategies for Systematic Reviews
Fei Yu, Heather Kincaide, Rebecca Carlson
University of North Carolina at Chapel Hill, USA
This study evaluated the performance of ChatGPT-3.5 and ChatGPT-4 in developing search strategies for systematic reviews. Using the Peer Review of Electronic Search Strategies (PRESS) framework, we employed a two-round testing format for each version. In the first round, both versions displayed comparable competence when assessed quantitatively against the PRESS measures. However, qualitative feedback from two professional health sciences librarians indicated that ChatGPT-4 outperformed ChatGPT-3.5, particularly in suggesting MeSH term inclusion and refining search strategy formulations. In the second round, prompts were refined based on feedback from the first round, and both qualitative and quantitative evaluation results confirmed ChatGPT-4’s superiority. This study provides empirical evidence of advancements in language model capabilities, highlighting ChatGPT-4’s enhanced efficiency and accuracy in developing search strategies for systematic reviews.
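The prompting and review workflow is likewise described only at a high level; the snippet below is a minimal sketch, assuming the OpenAI chat completions API, of asking a model for a PubMed search strategy and running two crude surface checks loosely inspired by PRESS elements (subject headings, Boolean operators). The prompt wording, model names, and checks are assumptions for illustration, not the study's instrument; in the study itself, PRESS review was carried out by professional librarians.

# Illustrative sketch only; prompt wording and checks are assumed, not the study's protocol.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

PROMPT_TEMPLATE = (
    "You are a health sciences librarian. Draft a PubMed search strategy for a "
    "systematic review on the question below. Use MeSH terms where appropriate, "
    "combine concepts with Boolean operators, and present the strategy line by line.\n\n"
    "Review question: {question}"
)

def generate_search_strategy(question, model="gpt-4"):
    """Ask the model for a draft search strategy for the given review question."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT_TEMPLATE.format(question=question)}],
    )
    return response.choices[0].message.content

def crude_press_style_checks(strategy_text):
    """Rough surface checks inspired by two PRESS elements: use of subject headings
    and of Boolean operators. Real PRESS peer review is qualitative and expert-driven."""
    lower = strategy_text.lower()
    return {
        "mentions_mesh": "mesh" in lower,
        "uses_boolean_operators": any(op in strategy_text for op in (" AND ", " OR ", " NOT ")),
    }

if __name__ == "__main__":
    strategy = generate_search_strategy(
        "Does exercise therapy reduce chronic low back pain in adults?")
    print(strategy)
    print(crude_press_style_checks(strategy))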