Session
Paper Session 21: Knowledge Organization and Cultural Analytics

Presentations
11:30am - 11:45am
ID: 265 / PS-21: 1
Short Papers
Confirmation 1: I/we acknowledge that all session authors/presenters have read and agreed to the ASIS&T Annual Meeting Policies
Topics: Archives; Data Curation; and Preservation (archives; records; cultural heritage materials; digital data curation; digital libraries; digital humanities)
Keywords: knowledge organization, Chinese ancient book, ontology, system design, digital humanities

Using Ontology to Organize Chinese Ancient Books in the Digital Age

Peking University, People's Republic of China

The digitization, curation, and utilization of Chinese ancient books are crucial to the digital humanities. Despite progress in these areas, problems with data interoperability, data sharing, and data linkage persist because there is neither a standardized annotated corpus of ancient texts nor a general description framework for ancient books. To overcome these challenges, this paper proposes an ontology-based description framework that integrates catalogs of Chinese ancient books from various institutions, creating a standardized, interpretable, and researchable knowledge base. The framework combines general standards with the unique characteristics of ancient books, revealing complex relationships among books, between books and people, and between books and time periods, and thereby provides a more comprehensive understanding of the knowledge contained within ancient books. Additionally, this paper applies the framework to The National Rare Ancient Book Directory, a catalog containing 13,026 books from over 400 institutes, to develop an interactive system. The system is available at https://rarebib.pkudh.org/. Our results indicate that the framework standardizes the data and supports a more nuanced understanding of the knowledge within ancient books. This has noteworthy implications for individuals engaged in research, scholarship, and reading in the digital age.

11:45am - 12:10pm
ID: 337 / PS-21: 2
Long Papers
Topics: Archives; Data Curation; and Preservation (archives; records; cultural heritage materials; digital data curation; digital libraries; digital humanities)
Keywords: Person-oriented ontology, Biographical ontology, Digital humanities, Metadata crosswalk

Person-Oriented Ontologies Analysis for Digital Humanities Collections from a Metadata Crosswalk Perspective

1University of Melbourne, Australia; 2RMIT University, Australia

Mapping between different representations of similar data is a common challenge in digital humanities (DH). In practical DH collections, the ‘person’ is an essential and central unit, and other parts of a collection can link to the ‘person’ to form the knowledge base. However, there is still no general and widely usable person-oriented ontology in the DH community. Many practical DH projects rely on ontologies developed by their own DH experts, but these ontologies are not interoperable. Therefore, it is important to explore existing biographical ontologies and develop a comprehensive person-oriented ontology for DH. Using the metadata crosswalk method, we examined the ontologies provided for persons in three DH collections to analyze how they map onto standard ontologies such as FOAF (Friend of a Friend). This paper uncovers a significant and consistent gap between standard biographical ontologies and those used in practical DH collections, and identifies a set of heterogeneity problems, including differing granularities of metadata. Consequently, drawing on this metadata crosswalk, we propose three key types of person-oriented ontological elements: basic biographical elements, relational elements, and explanatory elements (such as career, connected with role and time).
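The crosswalk idea described above can be illustrated with a minimal sketch. It is not the authors' method or data: the collection field names (`full_name`, `occupation`, and so on) and the sample record are hypothetical, and only the FOAF property names on the right-hand side come from the FOAF vocabulary. The sketch maps each field of a person record to a FOAF property where a direct match is assumed to exist and reports the fields left without a match, which is one simple way to surface the kind of gap the abstract mentions.

```python
# Hypothetical crosswalk table: collection field -> FOAF property.
# None marks a field assumed to have no direct FOAF counterpart.
FOAF_CROSSWALK = {
    "full_name": "foaf:name",
    "given_name": "foaf:givenName",
    "family_name": "foaf:familyName",
    "gender": "foaf:gender",
    "associated_with": "foaf:knows",
    "occupation": None,
    "birth_place": None,
}


def apply_crosswalk(record, crosswalk):
    """Split a record into FOAF-mapped values and unmapped field names."""
    mapped = {}
    unmapped = []
    for field, value in record.items():
        target = crosswalk.get(field)
        if target is None:
            # Either absent from the crosswalk or explicitly unmatched.
            unmapped.append(field)
        else:
            mapped[target] = value
    return mapped, sorted(unmapped)


# Hypothetical person record from an imagined DH collection.
record = {
    "full_name": "Jane Example",
    "gender": "female",
    "occupation": "printer",
    "birth_place": "Melbourne",
    "associated_with": "John Example",
}

mapped, unmapped = apply_crosswalk(record, FOAF_CROSSWALK)
print("Mapped to FOAF:", mapped)
print("No direct FOAF match:", unmapped)
```

In this toy record the unmapped fields are exactly the career-like and place-like details, which loosely corresponds to the explanatory elements the abstract proposes beyond basic biographical and relational ones.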
This metadata crosswalk provides a foundation for future matching between person-oriented ontologies and facilitates semantic interoperability between DH collections.

12:10pm - 12:35pm
ID: 166 / PS-21: 3
Long Papers
Topics: Informetrics and Scholarly Publishing (bibliometrics; informetrics; scientometrics; altmetrics; open science; scholarly communication and new modes of publishing; measurement of information production and use)
Keywords: Interdisciplinary Prediction, Interdisciplinary Topic, Co-word Network, Link Prediction, Digital Humanities

Interdisciplinary Topic Link Prediction Based on Co-Word Network: A Case Study on Digital Humanities

1Renmin University of China, People's Republic of China; 2University of Hong Kong, People's Republic of China

Interdisciplinary research plays a crucial role in addressing complex challenges in science, technology, and society. Predicting interdisciplinary links between topics can unveil potential interdisciplinary relationships and foster innovation. Treating topics extracted from interdisciplinary research as interdisciplinary topics, we predict the potential links among them based on their co-word network, and we propose integrating topic semantic content features, author direct-collaboration features, and author indirect-collaboration features to improve prediction accuracy. We construct interdisciplinary topic link prediction models based on Graph Convolutional Networks (GCN), Graph Attention Networks (GAT), GraphSAGE, BERT, and Node2vec. Using digital humanities as a case study, our experimental results show that integrating semantic content, direct-collaboration, and indirect-collaboration features significantly improves the Area Under the Curve (AUC) and Average Precision (AP), outperforming predictions based solely on the co-word network. The predicted results provide valuable research directions and references for digital humanities scholars, with examples in the fields of cultural heritage and historical geographic information systems.
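To make the evaluation setup above concrete, here is a minimal sketch of link prediction on a co-word network scored with AUC and AP. It deliberately swaps in a simple common-neighbours heuristic in place of the GCN, GAT, GraphSAGE, BERT, and Node2vec models named in the abstract, and the topic names, observed edges, held-out links, and negative pairs are all hypothetical toy data rather than anything from the paper.

```python
from itertools import product


def build_adjacency(edges):
    """Build an undirected adjacency map from (topic, topic) co-word edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Score a candidate link by the number of neighbours the topics share."""
    return len(adj.get(u, set()) & adj.get(v, set()))


def auc(pos, neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def average_precision(pos, neg):
    """Mean precision at each positive's rank; tied negatives are ranked first."""
    ranked = sorted([(s, 1) for s in pos] + [(s, 0) for s in neg],
                    key=lambda t: (-t[0], t[1]))
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / len(pos)


# Hypothetical observed co-word edges between topics.
observed = [
    ("text mining", "cultural heritage"), ("text mining", "GIS"),
    ("cultural heritage", "3D modeling"), ("cultural heritage", "historical maps"),
    ("GIS", "historical maps"), ("3D modeling", "archaeology"),
    ("historical maps", "archaeology"),
]
# Hypothetical held-out true links and sampled non-links.
held_out = [("cultural heritage", "GIS"), ("cultural heritage", "archaeology"),
            ("GIS", "archaeology")]
non_links = [("text mining", "archaeology"), ("GIS", "3D modeling"),
             ("text mining", "3D modeling")]

adj = build_adjacency(observed)
pos_scores = [common_neighbors(adj, u, v) for u, v in held_out]
neg_scores = [common_neighbors(adj, u, v) for u, v in non_links]
auc_value = auc(pos_scores, neg_scores)
ap_value = average_precision(pos_scores, neg_scores)
print(f"AUC={auc_value:.3f}  AP={ap_value:.3f}")
```

A structure-only baseline like this plays the role of the "co-word network only" predictions mentioned in the abstract; feature-enriched models would be compared against it using the same two metrics.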
12:35pm - 12:50pm
ID: 324 / PS-21: 4
Short Papers
Topics: Data Science; Analytics; and Visualization (data science; data analytics; data mining; decision analytics; social analytics; information visualization; images; sound)
Keywords: named entity recognition, machine learning, data science, cultural analytics, Native American studies

Tuning out the Noise: Benchmarking Entity Extraction for Digitized Native American Literature

1University of Illinois at Urbana-Champaign, USA; 2University of Oklahoma, USA; 3Dartmouth College, USA; 4University of Sheffield, UK; 5Indiana University, USA

Named Entity Recognition (NER), the automated identification and tagging of entities in text, is a popular natural language processing task and has the power to transform restricted data into open datasets of entities for further research. This project benchmarks four NER models (Stanford NER, BookNLP, spaCy-trf, and RoBERTa) to identify the most accurate approach and to generate an open-access, gold-standard dataset of human-annotated entities. To meet a real-world use case, we benchmark these models on a sample dataset of sentences from Native American-authored literature, identifying edge cases and areas of improvement for future NER work.
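As an illustration of what benchmarking NER output against a gold standard can involve, the sketch below computes exact-match, entity-level precision, recall, and F1. The abstract does not state which metrics or matching rules the project used, so this scoring scheme is an assumption; the gold annotations and the two model outputs (`model_a`, `model_b`) are hypothetical placeholders, not results from the four systems named above.

```python
def precision_recall_f1(gold, predicted):
    """Exact-match scores over (start, end, label) entity tuples."""
    gold, predicted = set(gold), set(predicted)
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if true_pos else 0.0
    return precision, recall, f1


# Hypothetical human-annotated entities: (start token, end token, label).
gold = {(0, 2, "PERSON"), (5, 6, "LOC"), (9, 11, "ORG")}

# Hypothetical outputs from two unnamed models.
model_outputs = {
    # One wrong label and one spurious entity.
    "model_a": {(0, 2, "PERSON"), (5, 6, "ORG"), (9, 11, "ORG"), (12, 13, "LOC")},
    # Misses one entity but predicts nothing incorrect.
    "model_b": {(0, 2, "PERSON"), (9, 11, "ORG")},
}

scores = {name: precision_recall_f1(gold, pred)
          for name, pred in model_outputs.items()}
best = max(scores, key=lambda name: scores[name][2])  # highest F1

for name, (p, r, f) in scores.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print("Highest F1:", best)
```

Exact matching penalizes a correct span with the wrong label as both a false positive and a false negative; a benchmark interested in edge cases might additionally report partial-span or label-agnostic matches.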