Conference Agenda (All times are shown in Eastern Daylight Time)
Overview and details of the sessions of this conference. Please select a date or location to show only the sessions on that day or at that location. Please select a single session for a detailed view (with abstracts and downloads, if available).
What is a Data Document? Analyzing Four Emerging Data Documentation Frameworks in AI/ML
E. Maemura
University of Illinois Urbana-Champaign, USA
Documentation of datasets is a longstanding concern for data curation and research data management. Additionally, work in AI ethics, fairness, accountability, and transparency has taken on the challenge of documenting and describing datasets used to train or test machine learning models, though with few overlaps or points of intersection with information science approaches. In order to foster increased conversation and collaboration across fields, I aim here both to assess the current landscape of AI's data documentation frameworks and to understand where shared interests with RDM might be possible and fruitful. I analyze four prominent frameworks for documenting AI datasets, considering: (a) their goals, influences, and precedents; (b) the formal qualities of their materiality; and (c) where and how each framework has been adopted and applied. Results reveal some common features of documentation frameworks, as well as diverging goals and constraints. I close by reflecting on ways that information science might learn from these approaches stemming from AI to inform future work on documentation of datasets for scientific research and beyond.
2:30pm - 2:45pm
Exploring LLM AI in Automatic Generation of Abstracts for Research Publications
Y. Kim1, J. Lee1, S. Yang2
1Kyungpook National University, Republic of Korea; 2Louisiana State University, USA
A well-prepared abstract can help researchers find the resources they need by succinctly presenting the main points of a paper. However, creating a quality abstract that captures the key points of the full manuscript is a time- and effort-consuming task. In this study, we aimed to explore the possibility of using LLM AI as a tool to support authors who would like to draft a quality abstract for a research paper. We compared the semantic similarities among abstracts prepared by the authors, abstracts generated with LLM AI, and the full-text content of 120 papers from the ASIS&T 2024 conference. Findings include that different prompt engineering techniques did not generate semantically different abstracts, meaning that the baseline prompts performed well, possibly due to the advancement of LLM AI models. Also, experts preferred AI-generated abstracts over the authors' abstracts when there was semantic discrepancy between the two types of abstracts. This may indicate the usefulness of LLM AI as a tool to support human authors who may be struggling to draft a quality abstract of their research manuscript.
2:45pm - 3:15pm
Surprising Resilience of Scientific Publication During a Global Pandemic: A Large-Scale Bibliometric Analysis
C. Rusti1, K. Ahrabian1, Z. Wang2, J. Pujara1, K. Lerman1
1Information Sciences Institute, University of Southern California, USA; 2Tsinghua University, Beijing, China
Drawing on a global bibliographic corpus covering more than 23 million papers and 10 million disambiguated authors, we present the first longitudinal, institution‑level portrait of how COVID‑19 perturbed research activity and collaboration. Using multilevel regression and interrupted‑time‑series analysis, we trace participation, productivity, and collaboration for researchers at the 1,000 historically most‑productive universities prior to 2020, stratified by geography, field, career stage, and gender. Publication counts and co‑authorship networks surged in 2020, signaling an unexpected, rapid mobilization and resilience of the research system. Yet by late 2022 these metrics had reverted to their pre-pandemic trajectories, indicating that the spike was a short-lived reprioritization rather than a lasting shift. The lag inherent in many experimental pipelines – especially wet‑lab science – raises the prospect of delayed losses not yet visible within our time frame. Our study establishes an evidence‑based baseline for monitoring longer‑term effects and offers actionable insights for science‑policy makers seeking to safeguard research capacity during future global crises.