1. On Causal Inference in Humanistic Research
Reading Thad Dunning’s Natural Experiments in the Social Sciences (Cambridge, 2012), I am particularly struck by his discussion of study design. “How can causal inference be improved?” he asks on page 4, and answers: “In seeking to answer such questions, I place central emphasis on natural experiments as a ‘design-based’ method of research — one in which control over confounding variables comes primarily from research-design choices, rather than ex post adjustment using parametric statistical models.”1
This approach seems particularly well suited to computational study in the humanities, where the veracity of causal and statistical assumptions is often difficult to explicate and defend, let alone validate. The natural-experiment approach shifts reasoning about such assumptions from the statistical modeling phase of research to the design process, where it is expressed in the logic of the design itself. In short, it is the research design, rather than the statistical model, that does the heavy lifting.
2. Design as Argument
For this reason, Dunning writes, “substantive and contextual knowledge plays an important role at every stage of natural-experimental research — from discovery to analysis to evaluation.” The emphasis on context necessitates thinking about statistical concepts such as “effect” in ways that are grounded in the specific conditions of the historical or literary problem under study.2
This has immediate implications for how we frame computational humanities projects. Too often, the adoption of statistical or machine-learning methods in literary and historical research is understood as a gesture toward scientific legitimacy — a borrowing of method that preserves the assumption structure of the original social-scientific context while discarding the disciplinary safeguards that made those assumptions defensible.
Consider corpus selection. In standard distant reading practice, the corpus is assembled according to availability, prior canonical judgment, or the outputs of digitization projects, and is then treated as the ground on which analysis proceeds. From a design-based perspective, corpus selection is instead a moment of assumption-making that must be theorized explicitly: what counterfactual is implied by this particular set of texts?3
3. Toward a Design-Based Humanities
The natural experiment model is not, of course, directly transferable to the humanities. Historical and literary materials rarely permit the kind of exogenous variation that defines a true natural experiment in political science or economics. What the framework offers instead is a vocabulary for making explicit the design choices that computational humanities researchers already make implicitly.
A design-based computational humanities would begin not with a corpus and a method but with a research design: a specification of the variation to be exploited, the assumptions required to interpret that variation causally, and the limitations those assumptions impose on the conclusions that can be drawn.4
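One way to make such a specification concrete is to write it down as a small record before any corpus is assembled. What follows is a minimal sketch in Python, not an established schema: the class name `ResearchDesign`, its fields, and the helper `missing_elements` are all hypothetical, and the example design is invented purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ResearchDesign:
    """Hypothetical record of the elements a design-based project states up front."""

    question: str
    variation: str                 # the variation to be exploited
    assumptions: tuple[str, ...]   # what must hold to read that variation causally
    limitations: tuple[str, ...]   # what those assumptions rule out concluding

    def missing_elements(self) -> list[str]:
        """Return the names of any design elements that were left blank."""
        missing = []
        if not self.variation.strip():
            missing.append("variation")
        if not self.assumptions:
            missing.append("assumptions")
        if not self.limitations:
            missing.append("limitations")
        return missing


# An invented example; none of these statements describes a real study.
design = ResearchDesign(
    question="Did event X change how genre Y was written?",
    variation="Texts published shortly before versus shortly after event X",
    assumptions=(
        "Publication timing relative to X is unrelated to other traits of the texts",
        "The digitized texts are not selected differently before and after X",
    ),
    limitations=(
        "Conclusions apply only to texts near the date of X",
        "Nothing can be said about texts that were never digitized",
    ),
)

print(design.missing_elements())
```

The point of the sketch is only that a design with an empty list of assumptions or limitations is visibly incomplete before any model is fitted; whether these three fields are the right ones is itself a design decision.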
Lab Notes is precisely the kind of venue in which such methodological reflection can take place: short, process-oriented, and uncoupled from the pressure to present finished results. A note is an appropriate form for an observation that is not yet a finding — for a moment of reading that reorganizes the assumptions of a larger project still in progress.
- Bode, Katherine. "The Equivalence of 'Close' and 'Distant' Reading." Modern Language Quarterly 78.1 (2017): 77–106.
- Dunning, Thad. Natural Experiments in the Social Sciences: A Design-Based Approach. Cambridge: Cambridge University Press, 2012.
- Drucker, Johanna. "Humanities Approaches to Graphical Display." Digital Humanities Quarterly 5.1 (2011).
- King, Gary, Robert O. Keohane, and Sidney Verba. Designing Social Inquiry. Princeton: Princeton University Press, 1994.
- Moretti, Franco. Distant Reading. London: Verso, 2013.
- Piper, Andrew. Enumerations: Data and Literary Study. Chicago: University of Chicago Press, 2018.
- Underwood, Ted. Distant Horizons: Digital Evidence and Literary Change. Chicago: University of Chicago Press, 2019.
- Tenen, Dennis Yi. Plain Text: The Poetics of Computation. Stanford: Stanford University Press, 2017.
- Jockers, Matthew L. Macroanalysis: Digital Methods and Literary History. Urbana: University of Illinois Press, 2013.
- Ramsay, Stephen. Reading Machines: Toward an Algorithmic Criticism. Urbana: University of Illinois Press, 2011.
Notes
- 1. Dunning, Natural Experiments, 4. The emphasis on design echoes earlier methodological discussions in King, Keohane, and Verba's Designing Social Inquiry (1994).
- 2. Dunning, Natural Experiments, 7. See also Drucker's critique of quantitative data visualization in humanistic contexts (2011).
- 3. The critique of corpus construction in distant reading has been developed by Bode (2017) and others.
- 4. Piper's Enumerations (2018) and Underwood's Distant Horizons (2019) both demonstrate attentiveness to research design that goes well beyond method selection.