Editors’ Choice: A Dataset for Distant-Reading Literature in English, 1700-1922.

Literary critics have been having a speculative conversation about close and distant reading. It might be premature to call it a debate.

A “debate” is normally a situation where people are free to choose between two paths. “Should I believe Habermas, or Foucault? I’m listening; I could go either way.” Conversation about distant reading is different, first, because there’s not much need to make a choice. Have any critics stopped reading closely? A close reading of The Bourgeois suggests that Franco Moretti hasn’t.

More importantly, this isn’t a debate yet because most of the people involved aren’t free to explore both paths. So far only a tiny number of scholars have actually tried distant reading, and it’s easy to see why. You can wake up tomorrow and try a Foucauldian reading of Frankenstein, but you can’t wake up and trace patterns of change in a thousand novels. In either case, you may need to learn new methods, but in the “distant” case, it can also take years to assemble a collection of texts.

A dataset for distant reading
To reduce barriers to entry, I’ve collaborated with HathiTrust Research Center to create an easier place to start with English-language literature. It’s aimed at scholars studying long-nineteenth-century (1750-1922) fiction and poetry, but it will gradually expand into the twentieth century. This post describes the humanistic uses of the dataset; if you want technical information, there’s more on the page where the data actually lives.

Read More: A dataset for distant-reading literature in English, 1700-1922.

This content was selected for Digital Humanities Now by Editor-in-Chief Amanda Morton based on nominations by Editors-at-Large: Laura Braunstein, Catelynne Sahadath, Bobby Smiley, Matthew Lincoln, Erin Altman, Sasha Frizzell, Melissa Norr, Thomas Rushford, Rebecca Napolitano, Kevin, and Solmaz Mohammadzadeh-Kive.