Defactoring pace of change: Reviewing computational research in the digital humanities
While Jupyter notebooks are a boon for computational science, they are also a powerful tool in the digital humanities. Matt Burton offers an overview of the digital humanities community, discusses defactoring (a novel use of Jupyter notebooks to analyze computational research), and reflects upon Jupyter's relationship to scholarly publishing and the production of knowledge.
| Talk Title | Defactoring pace of change: Reviewing computational research in the digital humanities |
|---|---|
| Speakers | Matt Burton (University of Pittsburgh) |
| Conference | JupyterCon in New York 2017 |
| Conf Tag | |
| Location | New York, New York |
| Date | August 23-25, 2017 |
| URL | Talk Page |
| Slides | Talk Slides |
| Video | |
The Jupyter Notebook is an extremely popular tool in academia for teaching, exploratory analysis, and sharing code. While notebooks are increasingly popular among computational scientists, a very different academic community, the digital humanities, loves Jupyter notebooks as well. Matt Burton offers an overview of the digital humanities community, discusses a novel use of Jupyter notebooks to analyze computational research, and reflects upon Jupyter's relationship to scholarly publishing and the production of knowledge.

The digital humanities is a growing community of scholars from humanities disciplines such as English and history whose research, teaching, and publications are infused with digital technology. Some digital humanists leverage computational and data-intensive methods to gain new understandings of, and perspectives on, digitized historical records, such as the books stored in the HathiTrust. New computational methods not only expand our understanding of literature and history; they also expand the very basis of how we know what we know. The formal processes of scholarly publication, especially in the humanities, struggle to accommodate the increasingly multimodal outputs that computational and data-intensive research produces. As the outputs of academic research become imbricated with code, data, and interpretive prose, how are peers, especially in the humanities, supposed to review computationally inflected work?

Matt introduces defactoring, a technique that leverages the expressiveness of Jupyter notebooks to computationally interrogate and peer-review the code that is part of digital humanities publications. Literary historians Ted Underwood and Jordan Sellers use machine learning to analyze large corpora of historical texts. One of their projects, How Quickly Do Literary Standards Change?, uses logistic regression to draw out differences between reviewed and unreviewed poetry volumes published between 1820 and 1919. While Underwood and Sellers's final analysis has been formally published in an academic journal, the authors graciously conducted their research openly (rare among humanities scholars), posting article preprints on figshare and, more importantly, sharing their code and data on GitHub.

However, sharing data and code is only a first step; we also need a technique for critically engaging the code and rigorously reviewing its role in publication. Defactoring Pace of Change weaves a computational narrative that simultaneously annotates the code with expository prose and executes Underwood and Sellers's original analysis. At the core of this effort is the Jupyter Notebook, which affords the blending of code, text, and data into a single documentary form.
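To make the underlying method concrete, the following is a minimal sketch of the kind of analysis Pace of Change performs: fitting a logistic regression that separates reviewed from unreviewed poetry volumes on the basis of word frequencies. It is not Underwood and Sellers's actual pipeline; the file name, column names, and feature settings are illustrative assumptions.

```python
# A rough sketch, not Underwood and Sellers's actual code: logistic regression
# over bag-of-words features to separate reviewed from unreviewed volumes.
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Hypothetical corpus: one row per poetry volume, with its text and a label
# indicating whether it was reviewed in elite periodicals.
volumes = pd.read_csv("poetry_volumes.csv")  # assumed columns: "text", "reviewed"

model = make_pipeline(
    CountVectorizer(max_features=5000),  # word-frequency features
    LogisticRegression(max_iter=1000),   # the classifier family used in Pace of Change
)

# Cross-validated accuracy gives a rough sense of how separable the two
# classes of volumes are.
scores = cross_val_score(model, volumes["text"], volumes["reviewed"], cv=5)
print(f"Mean cross-validated accuracy: {scores.mean():.3f}")
```

The published analysis is considerably more involved than this toy version, which is precisely why Burton argues that the real code deserves review alongside the prose.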
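Defactoring itself is, at bottom, a documentary move: the original scripts are broken into segments and re-presented as a notebook in which each code cell is preceded by expository prose and can still be executed. Below is a minimal, hypothetical sketch of that kind of workflow using the nbformat library; the cell contents are placeholders, not the actual Pace of Change code or Burton's tooling.

```python
# A minimal sketch of a defactoring-style workflow: pair segments of an
# existing analysis script with expository Markdown cells in a new notebook.
# Cell contents here are illustrative placeholders.
import nbformat
from nbformat.v4 import new_code_cell, new_markdown_cell, new_notebook

segments = [
    ("## Loading the corpus\n"
     "The script begins by reading volume-level word counts.",
     "import pandas as pd\nfeatures = pd.read_csv('poetry_features.csv')"),
    ("## Fitting the model\n"
     "Logistic regression separates reviewed from unreviewed volumes.",
     "from sklearn.linear_model import LogisticRegression\n"
     "model = LogisticRegression(max_iter=1000)"),
]

nb = new_notebook()
for prose, code in segments:
    nb.cells.append(new_markdown_cell(prose))  # the reviewer's annotation
    nb.cells.append(new_code_cell(code))       # the original, runnable code

nbformat.write(nb, "defactored_analysis.ipynb")
```

The resulting notebook can then be run end to end, so the review of the code is itself executable alongside the prose commentary.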