Combative Collaboration: Readers, Literary Influence, and The Little Review
Northeastern University, United States of America
Recent digital scholarship has attended to the pace of historical change (Underwood) and the dynamics of textual influence (Jockers, Barron, et al.). At the same time, periodical studies has conceptualized the significance of seriality and the ways that time functioned in periodical print culture: as “one issue displaces another, a publication’s editor must avoid too much difference while supplying just the right amount of the same” (Mussell, 345). The seriality of magazines and newspapers has given rise to new questions about literary taste, influence, and causality among cultural works, questions that digital tools are well equipped to explore.
This paper will focus on the influence of readers and their letters to the editors, which had a significant yet overlooked role in constructing literary taste. I examine the extent of readers’ influence over other genres by topic modeling a corpus of modernist magazines. More specifically, I measure the cosine similarities of topic distributions to determine whether any one genre set the tone, or topic, for the following year. This method highlights the various roles in literary production and begins to analyze which roles held influence at different stages of cultural creation. Even the avant-garde Little Review, which prided itself on “Making No Compromise with the Public Taste,” regularly acknowledged and corresponded with its readers. Early-twentieth-century magazines are a site where the contests of literary taste are explicit and continuous. This matters for literary studies because it begins to show the complex and diverse activities that shaped literary production.
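The lagged comparison described above can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the genre labels, years, and three-topic proportions are hypothetical values, and the helper names cosine_similarity and lagged_similarity are assumptions introduced for the example.

```python
import numpy as np

def cosine_similarity(p, q):
    """Cosine similarity between two topic-proportion vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q)))

# Hypothetical mean topic proportions (three topics) per genre and year.
# In practice these would be averaged from a topic model's document-topic output.
topics = {
    ("letters", 1916): [0.6, 0.3, 0.1],
    ("poetry", 1916): [0.1, 0.2, 0.7],
    ("poetry", 1917): [0.5, 0.3, 0.2],
}

def lagged_similarity(topics, source_genre, target_genre, year):
    """Compare a source genre in one year with a target genre in the next year."""
    return cosine_similarity(
        topics[(source_genre, year)],
        topics[(target_genre, year + 1)],
    )

# Does next year's poetry resemble this year's letters more than this year's poetry?
letters_to_poetry = lagged_similarity(topics, "letters", "poetry", 1916)
poetry_to_poetry = lagged_similarity(topics, "poetry", "poetry", 1916)
print(letters_to_poetry > poetry_to_poetry)
```

A higher lagged similarity is only suggestive: it shows that one genre's topics anticipate another's, not that the first caused the second.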
TEI, Transformation, and Text Analysis: Building a Markup-based Toolkit for Word Embedding Models
Northeastern University, United States of America
This paper will share insights gained from building a toolkit that uses text encoding to improve corpus creation for text analysis, with a web interface that is designed for theoretically grounded experimentation in algorithmic text analysis. The Women Writers Project is currently developing the Women Writers Vector Toolkit (beta link at https://wwp.northeastern.edu/wwo/lab/wwvt/, final version to be published in December 2018), an interface that will allow users to explore several different word embedding models trained on texts from Text Encoding Initiative (TEI) corpora that include Women Writers Online, the Victorian Women Writers Project, and Early English Books Online. Word embedding models are a powerful method for studying relationships between words in large corpora, but training and querying them require knowledge of a computer programming language, such as Python or R.
This project has two important foci. First, we investigate advanced methods of transforming TEI-encoded texts to improve results for both precision and semantic nuance in text analysis. We are using markup to remove elements that tend to distort results (such as speaker labels in drama), to improve tokenization based on encoding of named entities, and to enhance regularization by preferring elements such as those that mark expansions and corrections. We are also investigating methods of using the semantic distinctions instantiated in the markup to extract subcorpora based on generic features and document structures, enabling comparative analysis of prose and verse, paratextual and textual materials, quoted and non-quoted materials, and so on.
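The kind of markup-driven transformation described above might look like the following sketch. It is not the project's actual pipeline: it assumes standard TEI element names (speaker, choice, expan, corr, persName, placeName), uses Python's standard library in place of whatever transformation language the toolkit uses, and works on an invented sample fragment.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
SKIP = {TEI + "speaker"}                       # elements that tend to distort results
PREFERRED = (TEI + "expan", TEI + "corr")      # regularized readings inside <choice>
ENTITIES = {TEI + "persName", TEI + "placeName"}

def extract_text(elem):
    """Return plain text for elem, applying the markup-based rules above."""
    if elem.tag in SKIP:
        return ""
    if elem.tag == TEI + "choice":
        children = list(elem)
        for child in children:
            if child.tag in PREFERRED:
                return extract_text(child)
        # No preferred reading present: fall back to the first alternative.
        return extract_text(children[0]) if children else ""
    parts = [elem.text or ""]
    for child in elem:
        parts.append(extract_text(child))
        parts.append(child.tail or "")         # text following the child is kept
    text = "".join(parts)
    if elem.tag in ENTITIES:
        text = "_".join(text.split())          # keep a multi-word name as one token
    return text

# Invented sample fragment for illustration only.
sample = (
    '<sp xmlns="http://www.tei-c.org/ns/1.0"><speaker>Lady.</speaker>'
    "<l>I saw <persName>Mary Astell</persName> at "
    "<choice><abbr>ye</abbr><expan>the</expan></choice> "
    "<choice><sic>cort</sic><corr>court</corr></choice>.</l></sp>"
)
print(extract_text(ET.fromstring(sample)))
```

Writing the rules as a recursive walk keeps each decision (skip, prefer, join) tied to a specific element, so a different subcorpus or regularization policy only requires changing the sets at the top.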
Second, the project is developing a web interface to allow for experimentation on pre-trained models, supported by a wide range of contextual materials, including glossaries and explanations, suggested searches and case studies, and class assignments and activities. The site is designed to open up word embedding models to research and teaching that is grounded in a thorough understanding of how the models operate, without requiring computer programming knowledge.
This project thus tackles two key challenges in digital humanities research: how to integrate text encoding and text analysis and how to make command-line technologies more accessible to novice users and in the classroom.
“Everyone is Gay”: The Presentation of Queer Relationships in Fanfiction
Carnegie Mellon University, United States of America
Authors of fanfiction often explore same-sex relationships and gender expressions outside societal norms, leading researchers to label fanfiction as a “queer space” (Lothian et al., 2007). However, critiques of fanfiction culture posit that despite the commonality of queer relationships in fanfiction, these stories can still further existing heteronormative or cisnormative narratives (Walton, 2018). Does fanfiction as a “queer space” frame queer relationships as the norm? We process thousands of fanfiction stories with techniques from natural language processing to address this question on a large scale.
To analyze the framing of queer identity terms throughout fanfiction, we use Word2Vec, a machine learning technique that projects words as vectors into geometric space based on the contexts in which they appear. Analyzing these vectors along specific semantic axes allows us to visualize shifts in semantic associations for identity terms across corpora (An et al., 2018). We train word vectors on fanfiction from the website ArchiveOfOurOwn and compare them to a corpus representing “mainstream” fiction, drawn from COCA, HathiTrust, or other available sources. We plot identity labels on semantic axes built from antonym pairs such as same/different, fake/real, and good/bad, based on three dimensions of identity representation in discourse (Bucholtz and Hall, 2004). Preliminary comparisons with mainstream Google News vectors show fanfiction vectors for trans, gay, and queer projected closer to real than their mainstream counterparts, which fits the view of fanfiction as a “queer space.” However, we also find surprising associations: for instance, fanfiction vectors for LGBTQ terms such as “gay” are projected closer to “bad.”
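A semantic-axis projection of this kind can be sketched as below, following the general SemAxis idea of subtracting one pole's vector from the other and scoring a word by its cosine similarity to that axis. The two-dimensional vectors and the function name semaxis_score are hypothetical and chosen only to make the arithmetic visible; real Word2Vec vectors would have hundreds of dimensions.

```python
import numpy as np

def semaxis_score(vectors, word, positive_pole, negative_pole):
    """Cosine similarity between a word vector and the axis (positive - negative)."""
    axis = (np.asarray(vectors[positive_pole], dtype=float)
            - np.asarray(vectors[negative_pole], dtype=float))
    w = np.asarray(vectors[word], dtype=float)
    return float(np.dot(w, axis) / (np.linalg.norm(w) * np.linalg.norm(axis)))

# Hypothetical toy embeddings standing in for two separately trained models.
fanfiction = {"real": [1.0, 0.0], "fake": [-1.0, 0.0], "queer": [0.8, 0.6]}
mainstream = {"real": [1.0, 0.0], "fake": [-1.0, 0.0], "queer": [0.0, 1.0]}

# Scores near +1 lean toward "real", scores near -1 toward "fake".
fanfiction_score = semaxis_score(fanfiction, "queer", "real", "fake")
mainstream_score = semaxis_score(mainstream, "queer", "real", "fake")
print(fanfiction_score > mainstream_score)
```

Because each score is computed entirely within one embedding space, scores from separately trained models can be compared without first aligning the two spaces.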
We plan on exploring these findings to determine whether they point to hetero- and cisnormativity lingering in a queer space or just to limitations in surface text analysis. Vectors for explicit mentions of gender and sexual identity miss implicit framing of same-sex relationships where identity labels are not mentioned. ArchiveOfOurOwn metadata tags help identify these cases; we plan on training vectors for metadata using the paragraph2vec technique of Le and Mikolov (2014). We also plan on splitting our fanfiction corpora using this metadata into fics that represent “real-world” challenges to LGBTQ acceptance and stories that present an “aspirational” world in which being queer is already accepted. These techniques will nuance our findings and point to examples of fics that may represent queer relationships in expected or surprising ways.
An, Jisun, Haewoon Kwak, and Yong-Yeol Ahn. “SemAxis: A Lightweight Framework to Characterize Domain-Specific Word Semantics Beyond Sentiment.” Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. 2018. 2450–2461.
Bucholtz, Mary, and Kira Hall. “Theorizing Identity in Language and Sexuality Research.” Language in Society 33 (2004): 469–515.
Le, Quoc V., and Tomas Mikolov. “Distributed Representations of Sentences and Documents.” Proceedings of the 31st International Conference on Machine Learning. 2014. 1188–1196.
Lothian, Alexis, Kristina Busse, and Robin Anne Reid. “Yearning Void and Infinite Potential: Online Slash Fandom as Queer Female Space.” English Language Notes 45.2 (2007): 103–111.
Walton, S. S. “The Leaky Canon: Constructing and Policing Heteronormativity in the Harry Potter Fandom.” Participations: Journal of Audience and Reception Studies 15.1 (2018).