Conference Agenda
Session
5.06. Future Tools and Workflows
Presentations
Revolutionizing Archival Workflows: Human-Machine Collaboration in the AGI Era

School of Cultural Heritage and Information Management, Shanghai University, China

Short Description

This study explores the integration of human-machine collaboration and advanced technologies to revolutionize archival practices in the AGI era. Using the scientist archives management system as a case study, the research proposes a new workflow that transforms physical archives into actionable archival knowledge, demonstrating how integrating human oversight with machine learning enhances the quality and efficiency of archival management while maintaining trust and authenticity in records.

Abstract

The surge in archival materials, the growing demand for efficient methods of analyzing archival data, and the need for in-depth archives mining have brought unprecedented challenges to traditional archival workflows. Manual description processes are time-consuming, labor-intensive, and prone to errors, particularly when dealing with large volumes of archival materials. Digital archival data is often unstructured, complicating retrieval and utilization, especially for digitized historical archives and manuscripts, which are complex to recognize, analyze and describe. Moreover, traditional workflows fall short in granular content revelation, failing to meet users' diverse information needs in the new era. Using the archival materials of Chen Taosheng, the founder and pioneer of modern industrial microbiology in China, as a case study, this project explores a new archival workflow that integrates human-machine collaborative models into the management of scientist archives. The workflow uses advanced technologies such as an open-source archival database, a work environment supported by IIIF (the International Image Interoperability Framework), and LLMs (large language models), especially RAG (retrieval-augmented generation) and GraphRAG.
The project aims to revolutionize archival workflows, enhancing overall efficiency and quality in archival management in the AGI era. In the scientist archival data development workflow, the main research path is “physical form of archives--digital archival images--semantic archival data--archival knowledge”. The types of archival data in this project include original archival files, old photographs, newspaper clippings, and journal article clippings. The workflow begins with the digitization of physical archives and records, employing high-resolution scanners to convert archival documents, photographs, and other archival materials into digital formats. These files are organized and cataloged using an open-source archival management system, ArchivesSpace, where structured metadata and standard archival descriptions enhance data retrieval and operability. In this process, raw images are integrated as IIIF objects and recorded as Digital Object instances corresponding to physical items. IIIF is then integrated with an OCR text editing plugin in the Office Word environment, which preserves the commonly used archival work environment and links raw images seamlessly with their corresponding OCR texts, ensuring comprehensive documentation and accessibility. Original archival image data is transformed into highly structured or semi-structured JSONL data. Traditional manual description tasks are thus converted into data mapping tasks that can be completed automatically by LLMs. Once a high-quality electronic archival database is constructed, it can serve as a private AIGC database for the classic RAG or GraphRAG pipelines used with LLMs, supporting content mining, content analysis, and other tasks that can reveal the value of archives. The collaboration between humans and machines is crucial in navigating the complexities of the AGI era.
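The mapping of description tasks onto JSONL records, as described above, can be illustrated with a minimal sketch. The field names (`item_id`, `iiif_manifest`, `ocr_text`, `material_type`, `reviewed_by_human`) and the example URL are assumptions made for illustration only; the abstract does not specify the project's actual schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ArchivalRecord:
    # All field names are hypothetical; the project's real schema is not given.
    item_id: str
    iiif_manifest: str       # link from the record back to the raw image
    ocr_text: str            # OCR transcription of the image
    material_type: str       # e.g. "photograph", "newspaper clipping"
    reviewed_by_human: bool = False


def to_jsonl(records):
    """Serialize records as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(asdict(r), ensure_ascii=False) for r in records)


def pending_review(jsonl_text):
    """Return the ids of records that no human has verified yet."""
    pending = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        if not record["reviewed_by_human"]:
            pending.append(record["item_id"])
    return pending
```

Keeping a review flag on each record is one possible way to make the human oversight step explicit before the records are used for retrieval.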
Human expertise in archival science ensures the contextual accuracy and ethical handling of records, while machine capabilities significantly expedite the processing and organization of vast amounts of data. This symbiotic relationship enhances the overall workflow, leveraging the strengths of both human intuition and machine precision. Ensuring the accuracy and trustworthiness of archival documents requires robust validation mechanisms, in which human oversight plays a critical role in verifying the outputs generated by automated systems.

Tracing Archival Practices: A Corpus-Based Analysis of Digital Tools and Methods

Computational Humanities Group, Leipzig University, Germany

Short Description

This study maps the landscape of digital tool adoption in archives, aiming to identify both digitally enhanced processes and areas still relying on analog methods. The research analyzes 22,383 German archival texts (2002-2024) using a large language model to classify digital tool mentions into TADiRAH categories. Initial results show emphasis on dissemination and storage tools, with less focus on analysis.

Abstract

1. Introduction

The ongoing digital transformation has substantially affected archival practices and processes. This raises important questions: Which digital technologies, tools, and methods have been and are currently in use for archival tasks? Where do analog processes continue to dominate, and how might digital tools and methods enhance the efficiency of archivists' work? This study uses computational methods to extract mentions of digital tools and their applications from a corpus of journal articles and blog posts in the archival domain. By examining the past and present state of archival practices through the lens of digital humanities research, this study aims to contribute to the conversation about future opportunities and challenges in the profession.

2.
Data

This study's dataset includes texts from three German archival sources: two blogs, the VdA blog (Association of German Archivists) and the "Archive 2.0" blog, as well as the journal “ARCHIV. theorie & praxis.” The corpus includes 22,383 document samples written in German and English between 2002 and 2024.

3. Methods

This research uses a zero-shot approach with Gemini 1.5 Flash, a pre-trained large language model (LLM), to identify references to digital tools and methods in the corpus. The LLM classifies these references into one of seven high-level categories from the TADiRAH taxonomy (https://tadirah.info), providing insights into how archivists and researchers utilize digital tools as described in the texts. Each document is analyzed using a structured prompt that includes a task description, definitions of relevant entities (e.g., “digital tools” and TADiRAH taxonomy classes), and task-specific guidance (e.g., accounting for multiple tool mentions in a single text). The analysis results are stored as JSON and then processed further to identify trends and developments in digital tool usage in archival science.

4. Results

Preliminary tests conducted on a small subcorpus reveal a bias towards the TADiRAH classes “Disseminating” and “Storing,” while evidence for the class “Analyzing” is minimal. The methodology is scalable and makes it possible to trace the identified tools back to their sources, and the LLM’s reasoning for each classification is interpretable. This approach enables analysis of broad trends across the corpus yet also supports detailed examination of individual results. Final results from the full corpus analysis will be presented at the conference.

5. Conclusion

This study examines digital tool usage in archival science, with complete findings to be presented at the conference. Using computational analysis, it measures the implementation and current application of digital tools and identifies areas for future development.
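The post-processing step named in the Methods section, in which per-document JSON results are aggregated into category trends, might look roughly like the sketch below. The JSON layout (`tool_mentions`, `tool`, `category`) is a hypothetical format chosen for illustration, not the study's actual output schema, and the set of allowed categories is passed in by the caller rather than hard-coded.

```python
import json
from collections import Counter


def tally_categories(json_documents, allowed_categories):
    """Count tool mentions per taxonomy category across stored results.

    json_documents: iterable of JSON strings, one per analyzed text.
    allowed_categories: set of category names the caller accepts.
    Mentions with any other label are counted under "_unrecognized"
    so that unexpected model output stays visible for manual review.
    """
    counts = Counter()
    for raw in json_documents:
        result = json.loads(raw)
        for mention in result.get("tool_mentions", []):
            category = mention.get("category")
            if category in allowed_categories:
                counts[category] += 1
            else:
                counts["_unrecognized"] += 1
    return counts
```

Because each count derives from a stored mention, a result can still be traced back to the individual document that produced it.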
However, the study's exclusive focus on German-speaking archival texts and its reliance on a zero-shot approach may limit its broader applicability. Despite these constraints, the findings provide valuable insights into digital tool adoption in archival practices and implications for the field's future.

This measure is funded by the European Union and co-financed with tax revenues based on the budget approved by the Saxon State Parliament.

Structure, process, outcome. A framework for assessing relational and emotional aspects in archival work (working title)

IISH, The Netherlands

Short Description

This paper proposes a new framework for assessing archival success, inspired by critical archival theory and by methodologies for measuring the quality of healthcare. The goal of this framework is to formulate indicators that value the relational and emotional aspects of archival work often overlooked by traditional metrics. The framework focuses on three key indicators: structure, process, and outcome. Valuing relational and emotional aspects also implies different skill requirements for archivists.

Abstract

In recent years, archival studies has connected archival theory to broader social justice concerns, aligning with discussions in fields such as history, gender studies, critical theory, and heritage studies. Scholars like Laurent & Wright, Caswell, and Lowry emphasize a people-centered approach to archiving. Their work advocates for more equitable practices in archival work, focusing on the impact of records on communities and the relationships between archivists, creators, and users, and accommodating different epistemologies and standards. However, despite these shifts in the archival field, many archival institutions continue to rely on assessment indicators such as the number of records processed, meters of material collected, or the number of users.
These traditional metrics are not suited to capturing the relational, emotional, and affective aspects that have become a focal point in archival practice. Consequently, an alternative framework for measuring archival success is needed, one that emphasizes and values relationships, care, and emotional labor, which are often invisible and undervalued in current assessment frameworks. Drawing inspiration from healthcare delivery models, we propose a framework to measure archival success based on the indicators structure, process, and outcome:

1. Structure: This refers to the relationship between archivists, the archival institution, and the creators or subjects of records, as well as the users. It emphasizes the need for relational care and empathy between these parties, reflecting the influence of Cifor & Caswell's work.

2. Process: This refers to the decision-making power and procedures in archiving, where self-determination for record creators and subjects should be the leading principle. It advocates for group deliberation in decisions about appraisal, description, and accessibility, and about the decision-making procedure itself. Group deliberation is also about the creation of shared meaning and collective responsibility through the exchange of ideas and beliefs about the values and objectives of recordkeeping. This approach draws on trauma-informed practices, incorporating elements such as creating safe spaces, ensuring transparency, and facilitating informed participation in archival decisions.

3. Outcome: This involves safeguarding the integrity of the process by reflecting on it and making power dynamics explicit. This can manifest in practices like provenance tracking and clear documentation of decision-making, ensuring that the roles and decisions of archivists and creators are transparent.
In this proposed framework, assessment indicators that measure care and affective relationships within archival work shift archivists’ roles from being decision-makers to facilitators of the decision-making process. Archivists would still perform archival labor but leave decisions regarding appraisal, ordering, and accessibility to the creators or subjects of records. To be well equipped for this role, archivists would need skills traditionally found in fields like social work, communication, and psychology. These skills should be integrated into archival education and become part of professional requirements alongside traditional skills like cataloging, appraisal, and description. Embracing this alternative framework for measuring archival success requires a change in both values and practices in archival institutions, one that recognizes the emotional labor inherent in archival work.

Geohistorical sources: management, memory and European rights

Autonomous University of Madrid, Spain; International University of Valencia, Spain

Short Description

The establishment of European states was predicated on an understanding of the territory and its populace, as evidenced by geohistorical sources (cadastres, censuses, etc.). This heterogeneous documentary corpus is intricate to analyse, so an application has been developed to facilitate the management, connection and georeferencing of these dispersed data. The application employs automated transcription, machine learning and artificial intelligence to preserve historical memory.

Abstract

Europe's past is inscribed in archives, records, and cartography, which collectively offer a historical perspective on the evolution of society, property, and the law. Our application provides an innovative platform that facilitates the collection, transcription, and management of geohistorical sources from the 18th century, offering a comprehensive approach to the study of the past.
The automated transcription of handwriting is handled by Transkribus, a platform that uses artificial intelligence and deep learning neural networks to accurately convert historical documents into editable digital text by means of Handwritten Text Recognition (HTR) models. Line and character segmentation enables the extraction of data, facilitating the mass digitisation of historical records from diverse geographical locations and linguistic backgrounds. The digitisation and structuring of historical data has enabled the reconstruction of families and their associated properties, as well as the calculation of property values at various points in time and across different geographical locations. Furthermore, the georeferencing of historical cartography using geographic information systems (GIS) has introduced a new visual and analytical dimension to the study of property history. The present database contains examples from geohistorical sources in Portugal, Spain and Italy, including notary records, land registers and historical censuses, which have been used to analyse the socio-economic configuration of the respective periods. This methodological approach facilitates comprehension of the distribution of property in these countries, as well as analysis of its transformations over time and their impact on the contemporary configuration of property rights and structures. The value of these geohistorical sources lies in their ability to preserve historical memory and provide tools for archival research, heritage management and analysis of the evolution of property rights on the continent. The application, created in JavaScript and PostgreSQL together with PostGIS, facilitates access to historical data and provides a methodological framework for the study of history with cutting-edge digital tools.
This framework favours interdisciplinary work between archivists, historians and specialists in digital humanities. The application connects dispersed data and transforms it into structured information, thereby redefining the management of Europe's documentary heritage. This in turn contributes to a deeper understanding of the past and provides new insights into the history of rights in Europe. The tool thus facilitates access to a living, accessible and useful historical memory for the academic and professional community.
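The georeferencing of historical cartography mentioned in the abstract can be illustrated with a minimal sketch. It assumes a simple affine model fitted from exactly three ground control points, which is one common georeferencing technique; the abstract does not state which method or how many control points the application uses, and all function names and coordinates here are hypothetical. Python is used for the sketch even though the application itself is written in JavaScript.

```python
def fit_affine(pixel_pts, world_pts):
    """Fit x = a*col + b*row + c and y = d*col + e*row + f.

    pixel_pts: three (col, row) positions on the scanned map.
    world_pts: the matching three (x, y) map coordinates.
    Solved exactly with Cramer's rule; returns (a, b, c, d, e, f).
    """
    (c1, r1), (c2, r2), (c3, r3) = pixel_pts
    det = c1 * (r2 - r3) - r1 * (c2 - c3) + (c2 * r3 - c3 * r2)
    if det == 0:
        raise ValueError("control points are collinear")

    def solve(v1, v2, v3):
        a = (v1 * (r2 - r3) - r1 * (v2 - v3) + (v2 * r3 - v3 * r2)) / det
        b = (c1 * (v2 - v3) - v1 * (c2 - c3) + (c2 * v3 - c3 * v2)) / det
        c = (c1 * (r2 * v3 - r3 * v2) - r1 * (c2 * v3 - c3 * v2)
             + v1 * (c2 * r3 - c3 * r2)) / det
        return a, b, c

    xs = [p[0] for p in world_pts]
    ys = [p[1] for p in world_pts]
    return solve(*xs) + solve(*ys)


def pixel_to_world(params, col, row):
    """Apply the fitted affine transform to one pixel position."""
    a, b, c, d, e, f = params
    return a * col + b * row + c, d * col + e * row + f


def to_wkt_point(x, y):
    """Format a coordinate pair as a WKT point string."""
    return f"POINT({x} {y})"
```

A WKT string of this kind is a format that spatial databases such as PostGIS can parse, which is why it is used here as the hand-off format. Real historical maps are often distorted, so more control points and a least-squares or higher-order fit would usually be needed in practice.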