Conference Agenda

Overview and details of the sessions of this conference. Please select a date or location to show only sessions held on that day or at that location. Please select a single session for a detailed view (with abstracts and downloads, if available).

Session Overview
Session: Making and Breaking Digital Texts I
Time: Thursday, 04/Dec/2025, 1:30pm - 3:00pm

Location: Roland Wilson Building | 3.04 Seminar Room 3 (30)


Presentations

Democratising historical research with AI and vector search: a case study of the Australian Joint Copying Project

David Charles Goodman, Daniel Roy Russo-Batterham

University of Melbourne, Australia

For almost 50 years (1948 to 1997) the Australian Joint Copying Project microfilmed documents in the UK relating to the history of Australia, New Zealand and the Pacific. By the end of its work, the Project had produced 10,419 microfilm reels covering 8 million documents created between 1560 and 1984. Between 2017 and 2020, the National Library of Australia digitised this massive and important collection, including more than 10,000 pages of descriptive text and finding aids. Simple metadata search allows basic navigation of the collection. For experienced academic researchers who know what they are doing and what they want to look at, that may be all that is needed. But for others, this massive historical collection remains effectively hidden from view, and its broader research potential thus unrealised.

Recent advances in Optical Character Recognition and document layout detection offer exciting opportunities for extracting structured text from materials such as these, despite their variable quality and mixed formats (including a great deal of handwriting). Vector databases and associated technologies allow searches based on semantics rather than on the orthography of the search term. This key advance allows researchers to trawl the material for ideas or concepts as well as for precise utterances, potentially democratising access. For example, “the woman question” returns 490 hits in Trove newspapers for 1880-1890 and “sex equality” 353, but a high school history student who searched for “gender equality” or “sexism” would find nothing in that decade. By leveraging Large Language Models, we can make this research process even more user-friendly by allowing queries to be posed in natural language and results to be summarised. Opening up at least some of the riches of the AJCP collections to a broader range of researchers – including Pasifika and Indigenous researchers, family historians and citizens with a broad but less trained interest in the past – would, we think, be a worthy aim. We look forward to reporting on progress and demonstrating our results.
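
The sketch below is not the authors' system; the model choice and sample passages are illustrative placeholders. It shows the basic mechanics of the semantic search described above: documents and a modern query are embedded as vectors, so a search for "gender equality" can surface a passage phrased as "the woman question" even though no words overlap.

# Minimal sketch of semantic (vector) search, assuming the sentence-transformers library.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice

passages = [
    "A public meeting was held to discuss the woman question.",
    "Correspondence concerning land grants in New South Wales.",
    "A pamphlet arguing for sex equality in colonial schools.",
]

# Embed the passages once; in practice these vectors would live in a vector database.
passage_vecs = model.encode(passages, normalize_embeddings=True)

def semantic_search(query: str, top_k: int = 2):
    """Return the passages closest in meaning to the query."""
    query_vec = model.encode([query], normalize_embeddings=True)[0]
    scores = passage_vecs @ query_vec  # cosine similarity (vectors are normalised)
    best = np.argsort(scores)[::-1][:top_k]
    return [(passages[i], float(scores[i])) for i in best]

# A modern query matches historical phrasings that share its meaning,
# even though the exact search term never appears in the documents.
print(semantic_search("gender equality"))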



Storyloom: South Asian Digital Storytelling in praxis

Preetha Mukherjee

Indian Institute of Technology, India

Digital storytelling, as the name suggests, refers to digital intervention in the production and dissemination of stories across various digital platforms. My ongoing research attempts to build a digital storytelling platform, 'Storyloom', that aims to digitize South Asian folk stories drawn from the migration literature of the subcontinent. After a brief outline of my project, I will look into the current digital storytelling landscape of India, arguing for the need to create India’s own digital cultural record. The study builds on existing scholarship on digital storytelling and defines its characteristic elements, which make it stand out in the plethora of current digital content. The case studies look at projects from 2010 to 2021 that retrieve and digitize stories of the country’s history and cultural heritage. Though these digital stories are a remarkable attempt at unraveling the voices of the past, the study recognizes the research gaps they leave. Storyloom is built on a theoretical framework that defines digital engagement with stories in India while retaining their indigeneity. The project contributes to curating a new niche of digital cultural record suited to the diverse nature of stories in South Asia.



Bridging Data Islands: Contextual Digital Practices for Diasporic and Embodied Archives

Jingdan Zhang

Australian National University, Australia

Cultural data from diasporic and Indigenous communities is increasingly digitised, yet often disconnected from its original contexts of meaning. While digital platforms preserve content, they frequently fail to maintain the interpretive frameworks, usage protocols, and community-specific significance that animate these materials. Such disconnects risk transforming living archives into inert repositories. Addressing this challenge, this paper explores the development of a “context-sensitive metadata bridge”: a model designed to support the cultural continuity of community-led digital archives. Drawing on the framework for embodied archives developed by Alliata et al. (2024) and the relational AI model proposed by Brown, Whaanga, and Lewis (2023), the proposed structure enables layered, place-based metadata annotation and community-authored narrative framing. This approach prioritises sustainability not only in technical terms but also as ethical care and epistemic resilience. The model is not intended to enforce uniformity, but rather to enable frictional, respectful dialogue between cultural data clusters, fostering a distributed yet meaningful network of archival knowledge. Situating this work within ongoing debates in digital curation, data sovereignty, and sustainable infrastructure design, the paper contributes to emerging efforts in digital humanities to build tools and systems that preserve both information and the values embedded within it.
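
As a purely illustrative sketch, and not the model proposed in the paper, the following shows one way a record in a context-sensitive metadata bridge might carry standard descriptive fields alongside community-authored context layers and usage protocols. All class and field names are hypothetical.

# Hypothetical data structure: a digitised item plus the contextual layers that travel with it.
from dataclasses import dataclass, field

@dataclass
class ContextLayer:
    """Community-authored framing attached to a digitised item."""
    community: str        # who authored this layer
    place: str            # place-based anchor for the annotation
    narrative: str        # interpretive framing in the community's own words
    usage_protocol: str   # conditions of access or reuse

@dataclass
class BridgedRecord:
    """Descriptive metadata bridged with one or more context layers."""
    item_id: str
    title: str
    descriptive_metadata: dict = field(default_factory=dict)       # e.g. Dublin Core-style fields
    context_layers: list[ContextLayer] = field(default_factory=list)

record = BridgedRecord(
    item_id="example-001",
    title="Oral history recording",
    descriptive_metadata={"format": "audio", "date": "1998"},
    context_layers=[
        ContextLayer(
            community="(community name)",
            place="(place of recording)",
            narrative="How this recording should be introduced and understood.",
            usage_protocol="Listen with the permission of the speaker's family.",
        )
    ],
)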



PROGRAMMING "JOYCEWARE": How poststructuralism invigorates digital analysis of Ulysses

Jasper Harrington

University of Melbourne, Australia

There is an emerging trend in scholarship which aims to draw links between poststructuralism and digital text analysis. In his revisionist history Code: From Information Theory to French Theory, Bernard Dionysius Geoghegan suggests the poststructuralist movement fulfilled “the promise of a theoretically rigorous approach to communicative codes” (2) made by advances in cybernetics. N. Katherine Hayles also stamps the poststructuralist seal of approval on foundational cybernetic methods such as “the null strategy” (644), which assumes texts generated by humans and machines are indistinguishable. In any case, the poststructuralist conceptualisation of language has an uncanny tendency to share technical nomenclature with digital text analysis. But it is in the corpus of James Joyce, and particularly Ulysses, that many of these terms and methods meet in apposition. This paper will substantiate the link between poststructuralism, Joyce’s lexicon and digital text analysis. It will begin with a brief overview of the poststructuralist account of language. The lens will then pivot to Joyce, articulating why poststructuralism offers a viable theoretical framework for a critical reading of Ulysses. It will then attempt to superimpose the lexicon and methodology of digital text analysis onto the poststructuralist account of Joyce. This effort will be empowered by elaborating the metaphor, first proposed by Derrida, that Ulysses and Finnegans Wake are a “1000th generation computer” (147). This conceptual model of language as it inheres in poststructuralism and digital text analysis will provide impetus to foreground ULEXIS, a digitised lexical companion to Ulysses.

Works Cited

Derrida, Jacques. “Two words for Joyce.” Post-structuralist Joyce: Essays from the French. Cambridge: Cambridge University Press, 1984, pp. 145–161.

Geoghegan, Bernard Dionysius. Code: From Information Theory to French Theory. Sign, Storage, Transmission. Durham; London: Duke University Press, 2023.

Hayles, N. Katherine. “Inside the Mind of an AI: Materiality and the Crisis of Representation.” New Literary History, vol. 54, no. 1, 2023, pp. 635–666.



EREA: Enhanced Research Exploration and Analysis

Ruibiao Zhu

The Australian National University, Australia

The increasing volume of scientific publications poses challenges for researchers in efficiently identifying relevant literature, synthesizing research trends, and exploring emerging ideas. Manual search and analysis processes are time-consuming and often insufficient for capturing complex citation relationships. This project presents an open-source Python-based system, EREA (Enhanced Research Exploration and Analysis), that integrates generative artificial intelligence, automated information retrieval, semantic vector search, and citation-based visualization to support enhanced research exploration. User-defined queries are processed to extract structured keywords, retrieve scholarly articles from Google Scholar, and supplement metadata using OpenAlex. Retrieved data are structured, embedded in a vector database for semantic retrieval, and visualized through interactive, offline HTML graphs. A research report is generated through large language model-assisted synthesis. Developed according to the FAIR (Findable, Accessible, Interoperable, Reusable) principles, the system accelerates research exploration, provides structured thematic insights, facilitates understanding through visual citation networks, and supports the identification of research gaps and future directions.
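
The following is a minimal sketch of two stages of the kind of pipeline described, not the EREA implementation itself: retrieving candidate works and building citation links among them via the public OpenAlex API. The endpoint and response fields follow the OpenAlex documentation; the query is a placeholder, and the Google Scholar, vector-search, visualization, and LLM-report stages are only indicated in comments.

# Illustrative pipeline sketch (not EREA): keyword query -> OpenAlex retrieval -> citation edges.
import requests

OPENALEX = "https://api.openalex.org/works"

def retrieve_works(query: str, per_page: int = 10) -> list[dict]:
    """Fetch candidate works matching a user-defined query from OpenAlex."""
    resp = requests.get(OPENALEX, params={"search": query, "per-page": per_page})
    resp.raise_for_status()
    return resp.json()["results"]

def citation_edges(works: list[dict]) -> list[tuple[str, str]]:
    """Build (citing, cited) pairs among the retrieved works, for graph visualization."""
    ids = {w["id"] for w in works}
    edges = []
    for w in works:
        for ref in w.get("referenced_works", []):
            if ref in ids:
                edges.append((w["id"], ref))
    return edges

if __name__ == "__main__":
    works = retrieve_works("semantic search for digitised archives")  # placeholder query
    print(f"Retrieved {len(works)} works; {len(citation_edges(works))} internal citation links.")
    # Downstream stages (not shown): embed abstracts in a vector database for semantic
    # retrieval, render the citation graph as an offline HTML visualization, and
    # synthesize a research report with a large language model.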