4:00pm - 4:15pm
ID: 119 / PS-14: 1
Short Papers
Confirmation 1: I/we acknowledge that all session authors/presenters have read and agreed to the ASIS&T Annual Meeting Policies
Topics: Domain-Specific Informatics (cultural informatics; cultural heritage informatics; health informatics; medical informatics; bioinformatics; business informatics; crisis informatics; social and community informatics)
Keywords: Borrowing, copyright, data science, music, note patterns
Detection of Musical Borrowing Using Data Science
Steven Walczak1, Thomas Moore-Pizon, Jr.2
1University of South Florida, USA; 2Kaiser University, University of South Florida, USA
Data science may be used to determine similarities between musical scores. Programs written in C++ capture note progressions from musical scores and compare progressions from different songs to identify overlapping areas. These tools enable the study of musical borrowing across musical genres and may assist in copyright violation cases. Results indicate that within the Celtic music genre, borrowing occurs in more than 10% of the songs.
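The comparison of note progressions described in this abstract can be illustrated with a minimal sketch. It is written in Python rather than the authors' C++, and it is not their implementation: the encoding of notes as MIDI pitch numbers, the fixed run length n, and the function names note_ngrams and shared_progressions are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Progression = Tuple[int, ...]


def note_ngrams(notes: List[int], n: int) -> Dict[Progression, List[int]]:
    """Map each run of n consecutive notes to the positions where it starts."""
    index: Dict[Progression, List[int]] = {}
    for start in range(len(notes) - n + 1):
        index.setdefault(tuple(notes[start:start + n]), []).append(start)
    return index


def shared_progressions(
    song_a: List[int], song_b: List[int], n: int = 4
) -> List[Tuple[int, int, Progression]]:
    """Return (start_in_a, start_in_b, progression) for every n-note run in both songs."""
    index_b = note_ngrams(song_b, n)
    matches = []
    for start_a in range(len(song_a) - n + 1):
        gram = tuple(song_a[start_a:start_a + n])
        for start_b in index_b.get(gram, []):
            matches.append((start_a, start_b, gram))
    return matches


# Hypothetical melodies encoded as MIDI pitch numbers (assumed encoding).
song_a = [62, 64, 66, 67, 69, 66, 62]
song_b = [60, 62, 64, 66, 67, 71]
print(shared_progressions(song_a, song_b))
```

Exact pitch matching is the simplest possible criterion. A practical tool would likely also need to handle transposition and rhythm, which this sketch ignores.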
4:15pm - 4:40pm
ID: 376 / PS-14: 2
Long Papers
Topics: Data Science; Analytics; and Visualization (data science; data analytics; data mining; decision analytics; social analytics; information visualization; images; sound)
Keywords: Citation prediction, Team composition, Team structure, XGBoost, Explainable AI, SHAP
Using Explainable AI to Understand Team Formation and Team Impact
Huimin Xu1, Min Song2, Maytal Saar-Tsechansky1, Ying Ding1
1The University of Texas at Austin, USA; 2Yonsei University, South Korea
The citation count of a scientific paper is considered a simple and direct indicator of its impact. This paper predicts papers’ citations from team-related variables covering team composition and team structure. Team composition includes team size, male/female dominance, academia/industry collaboration, the number of unique races, and the number of unique countries. Team structure consists of team power level and team power hierarchy. Team members’ previous citation counts, H-index, previous collaborators, career age, and previous paper counts serve as proxies for team power. We calculated the mean value and the Gini coefficient of these measures to represent team power level (the collective team capability) and team power hierarchy (the vertical difference of power distribution within a team), respectively. Using 1,675,035 CS teams in the DBLP dataset, we trained an XGBoost model to predict high/low citation. Our model reached an AUC of 0.71 and an accuracy of 70.45%. Using the Explainable AI method SHAP to evaluate features’ relative importance in predicting team citation categories, we found that team structure plays a more critical role than team composition in predicting team citation. High team power level, flat team power structure, diverse racial background, large team size, collaboration with industry, and male dominance can bring higher team citations.
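As an illustration of the two team-structure measures named in this abstract, the sketch below computes the mean and the Gini coefficient of one power proxy across a team. It uses the standard mean-absolute-difference definition of the Gini coefficient; the abstract does not say which formulation the authors used. The function names and the sample H-index values are hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple


def gini(values: Sequence[float]) -> float:
    """Gini coefficient: sum of |x_i - x_j| over all ordered pairs / (2 * n^2 * mean)."""
    n = len(values)
    if n == 0:
        return 0.0
    avg = mean(values)
    if avg == 0:
        return 0.0  # all-zero team: treat as perfectly equal
    total_diff = sum(abs(x - y) for x in values for y in values)
    return total_diff / (2 * n * n * avg)


def team_power_features(proxy_values: Sequence[float]) -> Tuple[float, float]:
    """Return (power level, power hierarchy) as (mean, Gini) of one proxy."""
    return mean(proxy_values), gini(proxy_values)


# Hypothetical H-index values for two four-person teams.
flat_team = [10, 10, 10, 10]
steep_team = [2, 3, 5, 30]
print(team_power_features(flat_team))
print(team_power_features(steep_team))
```

Both hypothetical teams have the same mean, so they share a power level, but the second has a much higher Gini coefficient, meaning a steeper power hierarchy.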
4:40pm - 5:05pm
ID: 175 / PS-14: 3
Long Papers
Topics: Libraries (librarianship; libraries; museums; other cultural institutions; information services; scientific and technical information; technology in libraries)
Keywords: Large language models (LLMs); ChatGPT; authorship; attribution; library webpages.
What Is a Person? Emerging Interpretations of AI Authorship and Attribution
Heather Moulaison-Sandy
University of Missouri, USA
As of spring 2023, the scholarly community has been eager to explore how AI-produced content should be integrated into both academic writing and scholarly publishing. This paper investigates the prevailing responses to the introduction of ChatGPT in November 2022 and the interest that has been afforded it by both the academy and the publishing industry. A review of the published literature on aspects of ChatGPT authorship was carried out, finding that government and the publishing industry have unequivocally asserted that large language models (LLMs) like ChatGPT do not possess the traits of a person and are not able to author texts as a result. Other approaches, including practice, have been less vehement. To assess the integration of instructions on referencing ChatGPT using APA, top Google hits in the .edu domain were collected and analyzed over a 6-week period from March 14 to April 18, 2023, a time during which official recommendations of the APA Style were finalized. Findings reveal that librarians were quick to provide guidance, but slow to update that guidance, contributing to the potential for misunderstanding the affordances of and best practices for work with LLMs.
5:05pm - 5:20pm
ID: 229 / PS-14: 4
Short Papers
Topics: Artificial Intelligence (machine learning; text mining; natural language processing; deep learning; value-sensitive AI design; transparent and explainable AI)
Keywords: Digital libraries, multilabel classification, context-dependent language model, auxiliary data, computational poetry analysis
Computational Thematic Analysis of Poetry via Bimodal Large Language Models
Kahyun Choi
Indiana University Bloomington, USA
This article proposes a multilabel poem topic classification algorithm utilizing large language models and auxiliary data to address the lack of diverse metadata in digital poetry libraries. The study examines the potential of context-dependent language models, specifically bidirectional encoder representations from transformers (BERT), for understanding poetic words and utilizing auxiliary data, such as author's notes, in supplementing poetry text. The experimental results demonstrate that the BERT-based model outperforms the traditional support vector machine-based model across all input types and datasets. We also show that incorporating notes as an additional input improves the performance of the poem-only model. Overall, the study suggests pretrained context-dependent language models and auxiliary data have potential to enhance the accessibility of various poems within collections. This research can eventually assist in promoting the discovery of underrepresented poems in digital libraries, even if they lack associated metadata, thus enhancing the understanding and appreciation of the literary form.
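To make the multilabel aspect of this abstract concrete, the sketch below shows how per-topic scores can be turned into a set of topic labels, so that a poem may receive several topics or none. It is a generic illustration, not the author's model: the topic names, the scores, and the 0.5 threshold are assumptions, and the scores stand in for the outputs of a classifier such as a BERT-based one.

```python
import math
from typing import Dict, List


def sigmoid(x: float) -> float:
    """Squash a raw score into a probability-like value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_topics(logits: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Keep every topic whose sigmoid score reaches the threshold.

    Each topic is decided independently, which is what distinguishes
    multilabel classification from picking the single best class.
    """
    return [topic for topic, score in logits.items() if sigmoid(score) >= threshold]


# Hypothetical raw scores for one poem (assumed values, not model output).
poem_logits = {"love": 2.0, "nature": -1.0, "death": 0.3}
print(predict_topics(poem_logits))
```

With these assumed scores the poem is tagged with two topics, "love" and "death", while "nature" falls below the threshold.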
5:20pm - 5:35pm
ID: 144 / PS-14: 5
Short Papers
Topics: Research into Practice (participatory research; practice-based research; research impact)
Keywords: Metaphors; Autoethnography; ChatGPT; Large Language Models (LLMs)
Using Playful Metaphors to Conceptualize Practical Use of ChatGPT: An Autoethnography
Smit Desai, Michael Twidale
University of Illinois at Urbana-Champaign, USA
In this short paper, we employ a month-long autoethnography to investigate the use of ChatGPT through metaphor analysis. We conceptualize three metaphors (unreliable narrator, court jester, and sounding board) that possess the most explanatory capabilities in describing what ChatGPT is, when it can be used, and how it can be helpful. We posit that grounding the use of ChatGPT in metaphors could facilitate discussions and streamline the intricate mechanism of Large Language Models (LLMs). Our study indicates that by proffering playful metaphors as substitutes for apocalyptic and arcane ones, we can enhance the accessibility and comprehensibility of ChatGPT for non-experts and policymakers, thereby potentially contributing to more informed and productive dialogues about the role and potential of LLMs in everyday life.