Conference Agenda (All times are shown in EDT)

Paper Session 17: Archives, Curation and Preservation [SDGs 1-17]
Wednesday, 28/Oct/2020:
9:00am - 10:30am

Session Chair: Jeonghyun Kim, University of North Texas, United States of America

9:00am - 9:15am
ID: 164 / PS-17: 1
Long Papers
Topics: Archives; Data Curation; and Preservation
Keywords: Reproducibility, data curation, data reuse, data sharing, earth system science

Cross-Disciplinary Data Practices in Earth System Science: Aligning Services with Reuse and Reproducibility Priorities

An Yan, Caihong Huang, Jian-Sin Lee, Carole Palmer

University of Washington, USA

As a data-intensive field that unites researchers from many disciplines, Earth System Science (ESS) is an ideal site for examining evolving cross-disciplinary data practices. This paper reports on results from a survey examining data sharing, data reuse, and research reproducibility practices of ESS researchers, aimed at informing improvements in data services for interdisciplinary sciences. Data reuse was found to be very high for new and comparative analyses but very limited for reproducing research. Data sharing was also strong, mostly through supplements to published papers, with moderate use of open access repositories. At the same time, there was interesting variability in both data sharing and reuse among ESS disciplines. The most pronounced challenges to reuse and reproducibility stem from limited documentation on how data are collected and managed, practices that are poorly supported by institutions, funders, and publishers. A more refined approach to “reproducibility” is needed that aligns with priorities and practices within the research community. Just as importantly, advances in data service models for ESS and other interdisciplinary fields need to account for the diverse and distributed system of repositories and require building a workforce with deeper knowledge of the complex data and methods that drive integrative systems science.

9:15am - 9:30am
ID: 184 / PS-17: 2
Long Papers
Topics: Archives; Data Curation; and Preservation
Keywords: Emulation practices, software preservation, access

Emulation Encounters: Software Preservation in Libraries, Archives, and Museums

Amelia Acker

University of Texas at Austin, USA

This paper reports on early findings of research in 2019 following three small teams of information professionals as they integrated emulation strategies into their day-to-day work at a museum, a university research library, and a university research archive and technology lab. Findings are reported from workplace observations and semi-structured interviews with preservationists (N=25) as they implement software emulation programs in cultural heritage institutions that collect and preserve software for access. Results suggest that the distributed teams in this cohort of preservationists have developed different emulation practices for particular kinds of “emulation encounters” in supporting different types of use and users. I discuss the implications of these findings for digital preservation research and emulation initiatives providing access to software or software-dependent objects, showing their significance for those developing software preservation workflows and building emulation capacities. This paper suggests that there are different emulation practices for preservation, research access, and exhibition undertaken by preservationists in libraries, archives, and museums; and, in examining particular visions of access, these findings call into question software emulation as a single, static preservation strategy for cultural heritage institutions.

9:30am - 9:45am
ID: 279 / PS-17: 3
Long Papers
Topics: Archives; Data Curation; and Preservation
Keywords: Software Curation, Software Sustainability, Open-source software, Data Curation

Finite and Infinite Games: An Ethnography of Institutional Logics in Research Software Sustainability

Nicholas Weber

University of Washington, USA

Modern research is inescapably digital, with data and publications most often created, analyzed, and stored electronically, using tools and methods expressed in software. While some of this software is general-purpose office software, a great deal of it is developed specifically for research, often by researchers themselves. Research software is essential to progress in science, engineering, and all other fields, but it is often not developed, shared, or stored in a sustainable way. This paper presents findings from an ethnography of two research software projects that have, over the last ten years, cooperatively organized development efforts to produce important software enabling scientific breakthroughs in both astronomy and macromolecular modeling. The work of these two projects is framed in terms of James Carse’s model of finite and infinite games. I argue that by incentivizing institutional governance that resembles the design of an infinite game, funding agencies can increase the sustainability of research software and improve various aspects of data-driven scientific discovery.

9:45am - 10:00am
ID: 324 / PS-17: 4
Long Papers
Topics: Archives; Data Curation; and Preservation
Keywords: long tail, data curation, topic analysis, research funding, astronomy

Mapping the “Long Tail” of Research Funding: A Topic Analysis of NSF Grant Proposals in the Division of Astronomical Sciences

Gretchen Renee Stahlman (1), P. Bryan Heidorn (2)

(1) Rutgers University, School of Communication & Information, USA; (2) University of Arizona, School of Information, USA

“Long tail” data are considered to be smaller, heterogeneous, researcher-held data, which present unique data management and scholarly communication challenges. These data are presumably concentrated within relatively lower-funded projects due to insufficient resources for curation. To better understand the nature and distribution of long tail data, we examine National Science Foundation (NSF) funding patterns using Latent Dirichlet Allocation (LDA) and bibliographic data. We also introduce the concept of “Topic Investment” to capture differences in topics across funding levels and to illuminate the distribution of funding across topics. This study uses the discipline of astronomy as a case study, exploring possible associations between topic, funding level, and research output, with implications for research policy and practice. We find that while different topics demonstrate different funding levels and publication patterns, dynamics predicted by the “long tail” theoretical framework presented here can be observed within NSF-funded topics in astronomy.