Conference Agenda (All times are shown in Greenwich Mean Time (GMT) unless otherwise noted)

Session

Paper Session 14: Data Science and Large Language Models

Time: Monday, 30/Oct/2023, 4:00pm - 5:35pm

Session Chair: Jacek Gwizdka, University of Texas at Austin, USA

Location: Chalon, 1st Floor, Novotel

Presentations

4:00pm - 4:15pm
ID: 119 / PS-14: 1
Short Papers
Confirmation 1: I/we acknowledge that all session authors/presenters have read and agreed to the ASIS&T Annual Meeting Policies
Topics: Domain-Specific Informatics (cultural informatics; cultural heritage informatics; health informatics; medical informatics; bioinformatics; business informatics; crisis informatics; social and community informatics)
Keywords: Borrowing, copyright, data science, music, note patterns

Detection of Musical Borrowing Using Data Science

Steven Walczak¹, Thomas Moore-Pizon, Jr.²

¹University of South Florida, USA; ²Kaiser University, University of South Florida, USA

Data science may be used to determine similarities between musical scores. Programs written in C++ capture note progressions from musical scores and compare progressions from different songs to identify overlapping areas. These tools enable the study of musical borrowing across musical genres and may assist in copyright violation cases. Results indicate that within the Celtic music genre, borrowing occurs in more than 10% of the songs.
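
The comparison of note progressions described above can be illustrated with a minimal sketch. It is written in Python for brevity, not in the authors' C++, and it is not their method: the function names, the fixed run length n, and the two melodies are hypothetical, and a real tool would also need to handle transposition, rhythm, and score parsing.

```python
def note_ngrams(notes, n):
    """Return the set of all runs of n consecutive notes in a melody."""
    return {tuple(notes[i:i + n]) for i in range(len(notes) - n + 1)}


def shared_progressions(song_a, song_b, n=4):
    """Return the runs of n consecutive notes that occur in both melodies."""
    return note_ngrams(song_a, n) & note_ngrams(song_b, n)


# Hypothetical melodies written as note names (not taken from the paper).
tune_a = ["D", "E", "F#", "A", "B", "A", "F#", "E"]
tune_b = ["G", "D", "E", "F#", "A", "B", "D", "E"]

# Progressions of four notes that appear in both tunes.
overlap = shared_progressions(tune_a, tune_b, n=4)
print(sorted(overlap))
```

Longer shared runs are stronger evidence of borrowing than short ones, so the choice of n trades sensitivity against false matches.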

4:15pm - 4:40pm
ID: 376 / PS-14: 2
Long Papers
Topics: Data Science; Analytics; and Visualization (data science; data analytics; data mining; decision analytics; social analytics; information visualization; images; sound)
Keywords: Citation prediction, Team composition, Team structure, XGBoost, Explainable AI, SHAP

Using Explainable AI to Understand Team Formation and Team Impact

Huimin Xu¹, Min Song², Maytal Saar-Tsechansky¹, Ying Ding¹

¹The University of Texas at Austin, USA; ²Yonsei University, South Korea

The citation of scientific papers is considered a simple and direct indicator of papers’ impact. This paper predicts papers’ citations through two groups of team-related variables: team composition and team structure. Team composition includes team size, male/female dominance, academia/industry collaboration, number of unique races, and number of unique countries. Team structure is made up of team power level and team power hierarchy. Team members’ previous citation counts, H-index, previous collaborators, career age, and previous paper counts serve as proxies for team power. We calculated the mean value and the Gini coefficient to represent team power level (the collective team capability) and team power hierarchy (the vertical difference of power distribution within a team), respectively. Using 1,675,035 CS teams from the DBLP dataset, we trained an XGBoost model to predict high/low citation. The model reached an AUC of 0.71 and an accuracy of 70.45%. Using the Explainable AI method SHAP to evaluate features’ relative importance in predicting team citation categories, we found that team structure plays a more critical role than team composition in predicting team citation. A high team power level, a flat team power structure, diverse racial backgrounds, large team size, collaboration with industry, and male dominance are linked to higher team citations.
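
As a rough illustration of the two team structure measures, the sketch below computes the mean (power level) and the Gini coefficient (power hierarchy) for one power proxy. The abstract does not state which Gini formula the authors used, so the standard rank-weighted form is assumed here; the function names and the H-index values are hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted formula applied to the values in ascending order.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def team_power(values):
    """Return (power level, power hierarchy) as (mean, Gini coefficient)."""
    return sum(values) / len(values), gini(values)


# Hypothetical H-indices for two four-person teams (not from the paper).
flat_team = [10, 10, 10, 10]
steep_team = [0, 0, 0, 40]

print(team_power(flat_team))
print(team_power(steep_team))
```

Both teams have the same power level, but the second concentrates all of it in one member, which the Gini coefficient captures as a steeper hierarchy.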

4:40pm - 5:05pm
ID: 175 / PS-14: 3
Long Papers
Topics: Libraries (librarianship; libraries; museums; other cultural institutions; information services; scientific and technical information; technology in libraries)
Keywords: Large language models (LLMs); ChatGPT; authorship; attribution; library webpages.

What Is a Person? Emerging Interpretations of AI Authorship and Attribution

Heather Moulaison-Sandy

University of Missouri, USA

As of spring 2023, the scholarly community has been eager to explore how AI-produced content should be integrated into both academic writing and scholarly publishing. This paper investigates the prevailing responses to the introduction of ChatGPT in November 2022 and the interest that has been afforded it by both the academy and the publishing industry. A review of the published literature on aspects of ChatGPT authorship was carried out, finding that government and the publishing industry have unequivocally asserted that large language models (LLMs) like ChatGPT do not possess the traits of a person and are not able to author texts as a result. Other approaches, including practice, have been less vehement. To assess the integration of instructions on referencing ChatGPT using APA, top Google hits in the .edu domain were collected and analyzed over a 6-week period from March 14 to April 18, 2023, a time during which official recommendations of the APA Style were finalized. Findings reveal that librarians were quick to provide guidance, but slow to update that guidance, contributing to the potential for misunderstanding the affordances of and best practices for work with LLMs.

5:05pm - 5:20pm
ID: 229 / PS-14: 4
Short Papers
Topics: Artificial Intelligence (machine learning; text mining; natural language processing; deep learning; value-sensitive AI design; transparent and explainable AI)
Keywords: Digital libraries, multilabel classification, context-dependent language model, auxiliary data, computational poetry analysis

Computational Thematic Analysis of Poetry via Bimodal Large Language Models

Kahyun Choi

Indiana University Bloomington, USA

This article proposes a multilabel poem topic classification algorithm utilizing large language models and auxiliary data to address the lack of diverse metadata in digital poetry libraries. The study examines the potential of context-dependent language models, specifically bidirectional encoder representations from transformers (BERT), for understanding poetic words and utilizing auxiliary data, such as author's notes, in supplementing poetry text. The experimental results demonstrate that the BERT-based model outperforms the traditional support vector machine-based model across all input types and datasets. We also show that incorporating notes as an additional input improves the performance of the poem-only model. Overall, the study suggests pretrained context-dependent language models and auxiliary data have potential to enhance the accessibility of various poems within collections. This research can eventually assist in promoting the discovery of underrepresented poems in digital libraries, even if they lack associated metadata, thus enhancing the understanding and appreciation of the literary form.

5:20pm - 5:35pm
ID: 144 / PS-14: 5
Short Papers
Topics: Research into Practice (participatory research; practice-based research; research impact)
Keywords: Metaphors; Autoethnography; ChatGPT; Large Language Models (LLMs)

Using Playful Metaphors to Conceptualize Practical Use of ChatGPT: An Autoethnography

Smit Desai, Michael Twidale

University of Illinois at Urbana-Champaign, USA

In this short paper, we employ a month-long autoethnography to investigate the utilization of ChatGPT through metaphor analysis. We conceptualize three metaphors (unreliable narrator, court jester, and sounding board) that possess the most explanatory capabilities in describing what ChatGPT is, when it can be used, and how it can be helpful. We posit that grounding the use of ChatGPT in metaphors could facilitate discussions and streamline the intricate mechanism of Large Language Models (LLMs). Our study indicates that by proffering playful metaphors as substitutes for apocalyptic and arcane ones, we can enhance the accessibility and comprehensibility of ChatGPT for non-experts and policymakers, thereby potentially contributing to more informed and productive dialogues about the role and potential of LLMs in everyday life.

Conference: ASIS&T 2023