Conference Agenda
4.14. Intelligent Archives: From AI Tools to Theoretical Models for Enhanced Access
Presentations
Application of Artificial Intelligence in Archival Openness Review by Provincial Archives in China: Current Situation, Challenges and Optimization Strategies
Shanghai University, People's Republic of China
Short Description
Applying AI to archival openness review is a crucial topic in need of in-depth exploration. This study uses a literature review and telephone interviews to investigate 17 representative provincial archives in China. The analysis shows that these archives have experience worth drawing on in AI-assisted archival openness review, but that they also face challenges. The research puts forward multi-dimensional countermeasures.
Abstract
[Purpose/Significance] The development of the new generation of information technology has led to a growing public demand for archival information. There is an urgent need to open more archives and to make archival openness more effective. In this context, applying artificial intelligence to archival openness review has gradually become an industry trend. Promoting the scientific application of this technology, and reflecting on the necessity of and feasible approaches to applying artificial intelligence in archival openness review, have therefore become pressing topics requiring thorough exploration.
[Method/Process] This research combines a literature review with telephone interviews to investigate 17 representative provincial archives in China, focusing on their existing review mechanisms, review methods, current state of technology application, review efficiency, and review quality. Taking into account archivists' current willingness to apply the technology and their future plans, it then analyzes the experiences that can be drawn upon and the existing problems in applying artificial intelligence to archival openness review.
[Result/Conclusion] Of the 9 provincial archives that use artificial intelligence, 4 have reported test results, 2 of which show AI review accuracy above 90%; the other 5 have not announced their accuracy rates, perhaps because they have not reached the desired standard. Overall, review efficiency is related to the working mechanism and the review method. Archives that use artificial intelligence to assist review generally achieve higher efficiency. Measured by the number of archives reviewed, 5 provincial archives have reviewed more than 1 million, 1 has reviewed more than 200,000, and 11 have reviewed fewer than 200,000. Given the heavy review workload and the backlogs, there is still ample room to improve the review efficiency of provincial archives.
The analysis reveals that provincial archives have experience worth drawing on in applying artificial intelligence to archival openness review, yet they also face challenges such as the high cost of applying the technology, a trust deficit toward AI, limited promotion and uptake, and a lack of professional competence. From both broad and narrow perspectives, this research therefore proposes multi-dimensional countermeasures: alleviating cost pressures in multiple ways, building trust in AI through inter-regional collaboration, integrating and learning from each other's technologies, and cultivating talent across the board. These countermeasures aim to provide feasible ideas for improving the quality and efficiency of archival openness review and for the scientific and steady application of artificial intelligence.
This study offers references and suggestions for provincial archives applying artificial intelligence in archival openness review, and provides theoretical insights for subsequent applications in this context.
Structured Data Extraction from Hungarian Civil Registers
National Archives of Hungary, Hungary
Short Description
This paper presents a workflow for extracting data from Hungarian handwritten civil registers (1895–1990), combining machine learning models with rule-based algorithms. Several post-processing methods are used to improve data accuracy, supporting structured extraction and search. A key feature is machine-assisted record linkage, which enables life course reconstruction and family reconstitution.
Abstract
This paper presents a workflow designed for extracting information from Hungarian handwritten civil registers recorded between 1895 and 1990. These vital records extend over nearly a century and provide valuable data for a wide range of archival users, from amateur genealogists to academic historians studying demography, linguistics, economics, and the social sciences.
The workflow processes scanned images and extracts data in a structured format. It combines rule-based algorithms and machine learning models, each optimized for its specific task. The key steps are page classification, segmentation, text line detection, and handwritten text recognition. The final step is database migration; before it, various post-processing actions are carried out to ensure data quality. Relationships between individuals are established through machine-assisted record linkage.
The paper also presents the experiences, solutions and dead ends encountered during development, in a world that requires daily adaptation to new and emerging solutions in the IT sector. We report on how the focus of development shifted from the initially challenging task of handwriting recognition to post-processing and service building. Through concrete examples, we demonstrate how the resulting content enhances access to archival materials and contributes to open science.
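To make the idea of machine-assisted record linkage more concrete, the following is a minimal illustrative sketch and not the workflow described in the paper. It assumes that each extracted record has already been reduced to a name and a birth year, and it proposes candidate links for a human reviewer using simple string similarity. The `PersonRecord` type, the `candidate_links` function and the threshold values are all hypothetical.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
import unicodedata


@dataclass(frozen=True)
class PersonRecord:
    """Hypothetical, simplified view of one person in one register entry."""
    record_id: str
    name: str
    birth_year: int


def normalize(name: str) -> str:
    """Lowercase, strip accents and collapse whitespace before comparing."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def candidate_links(births, marriages, min_similarity=0.85, max_year_gap=1):
    """Return (birth_id, marriage_id, score) candidates, best first.

    The thresholds are assumptions chosen for illustration only; the
    candidates are meant to be confirmed or rejected by a human reviewer.
    """
    candidates = []
    for b in births:
        for m in marriages:
            # Cheap blocking rule: skip pairs whose birth years disagree.
            if abs(b.birth_year - m.birth_year) > max_year_gap:
                continue
            score = name_similarity(b.name, m.name)
            if score >= min_similarity:
                candidates.append((b.record_id, m.record_id, round(score, 3)))
    return sorted(candidates, key=lambda c: c[2], reverse=True)
```

A production system would likely compare many more fields (parents' names, places, dates) and use a trained or probabilistic matcher; the sketch only shows the general shape of proposing links from structured fields rather than from free-text context.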
Having closely followed similar research efforts in Quebec, Barcelona, and Salt Lake City, we can see the strengths and weaknesses of our solution. The uniqueness of the work carried out by the team of linguists, archivists, and developers lies in the combination of machine learning models and traditional rule-based algorithms, which allows for a more adaptive extraction of structured data. Unlike other initiatives that extract named entities and relationships from unstructured text with the help of large language models, we cannot use the context but must rely only on the inherent logic of the data. Record linkage is also a key element: connecting individual records across databases enables the reconstruction of life courses and revitalizes family reconstitution research. Because flexibility is also crucial for a large archive, the system can later be adapted to process other structured sources, such as forms and index cards.
About »Instantiation« and »Aggregation« in Modern Archival Theory and Practice
1 Historical Archives Ljubljana, Slovenia; 2 Regional Archives Maribor
Short Description
By discussing "instantiation" and "aggregation" we want to analyse and define the meaning of both in various contexts of archival professional work. We see both terms as directly useful in the appraisal and arrangement of archival material, in the creation and acquisition of archival packages, and in the description of archival material, the design of metadata structures and their embedding in one-dimensional or multidimensional metadata tectonics.
Abstract
By discussing instantiation and aggregation in archival theory and practice, we want to analyse and define the meaning of the terms "instantiation" and "aggregation" in various contexts of archival professional and scientific research work.
On this basis, we will integrate both terms into the doctrine of modern archival theory and practice and thereby contribute to the development and more intensive implementation of the methods of archival professional work that are needed in the processes of both instantiation and aggregation. The discussion is important because these terms and their meanings are being introduced in the new conceptual model Records in Contexts, which will bring changes to the description of archival materials and to the creation of relations between the individual entities represented in archival records. We see both terms as directly useful in the theoretical and practical appraisal and arrangement of archival material, in the creation and acquisition of archival packages, and in the description of archival material, the design of metadata structures and their embedding in one-dimensional or multidimensional metadata tectonics. We see them as indirectly useful in the retrieval of information from archival databases and in the related procedures for designing and acquiring dissemination information packages (DIPs).
As the basic research method, we will use case studies of typical and atypical instantiations of archival material, in one or more variants, and of the related linear or hierarchical aggregation of archival entities. The definitions of both core concepts and their derivatives will be tested through the analysis of online sources and of records in the Slovenian Public Archival Database (SJAS) and in other national and international archival databases. The results of the research will be presented using the descriptive method. Examples of archival material used to test the suitability of the solutions will be analysed using the historical method, and we will also apply the empirical method.
For the introductory and concluding chapters, we will use the method of summarizing content together with the descriptive method. The insights gained from the research will be abstracted to the extent that they can be directly implemented in various methods of archival professional work. At the same time, these terminological, methodological and theoretical solutions will be directly useful for upgrading or improving existing methods of archival professional work in recognizing the completeness of archival materials, evaluating them, and placing them in existing and new hierarchical structures of archival materials. Indirect results are also expected in the form of clearly expressed requirements for improving and possibly upgrading solutions for the digitalization of existing and future archival professional procedures. Knowledge of the theory, procedures and terminology of instantiation and aggregation of archival entities will ensure a better understanding, and thus better treatment, of both concepts, including the implementation of procedures and the theoretical considerations behind them.
Strategies for Adapting Creators to RiC in Historical Archives: The Case of the Archivo Histórico Nacional (Spain)
Archivo Histórico Nacional, Subdirección General de los Archivos Estatales, Ministerio de Cultura (Spain)
Short Description
RiC has been a milestone in the development of new models of description. Its adoption by national archives is a challenge, and a greater one in the Spanish case because of the difficulties that arose in applying earlier standards.
The AHN has undertaken a project to review systematically the agents involved in the creation of the fonds it holds, in line with the ICA conceptual model, so as to correct certain anomalies and offer a richer and more useful contextual description.
Abstract
The publication of RiC marked a turning point in the development of new models for exchanging descriptive information. To take advantage of the opportunities it opens up, archives face the major challenge of adapting their information systems and working procedures. In Spain, the Archivos Estatales of the Ministerio de Cultura have strengths such as a well-established technological tool, a deep-rooted use of the ICA descriptive standards, and experience in designing and applying conceptual models such as NEDA-MC. There are also weaknesses that must be taken into account, the most significant being the lack of a unified national development of descriptive standards, confusion in applying some points of those standards, and the difficulties posed by the peculiarities of the Spanish archival tradition. The implementation of conceptual models of description is an opportunity to solve many of the problems these weaknesses have caused.
The diversity of descriptive practices, even in a long-established archive such as the AHN, makes it necessary to plan consistent strategies so that RiC is applied coherently and facilitates interoperability and the automated reuse of information. To this end, the AHN has set up a project that builds on decades of work in standardized description and on the wide variety of fonds it holds, whose very different archival histories and modes of acquisition have resulted in an assignment of creators that is not always consistent.
This project proposes a systematic review of the macro-description records, examining in depth the archival history of each grouping, clearly identifying the agents involved and determining what relationships link them. In short, it reviews the identification of fonds in order to analyze the causes of possible anomalies in the assignment of creators. The resulting information will make it possible to assess how each agent and its creation relationships can be adapted to the options provided for in RiC, and to make explicit the links with other agents involved in the genesis, accumulation, custody and transmission of each fonds, allowing a richer and more flexible formalization of the descriptions of records in their context. The paper proposed for the ICA International Congress Barcelona 2025 will explain in detail the methodology adopted to develop this project and will offer concrete examples of record groupings from the historical documentary heritage held in the AHN.