Right now, humanists often have to take topic modeling on faith. There are several good posts out there that introduce the principle of the thing (by Matt Jockers, for instance, and Scott Weingart). But it’s a long step up from those posts to the computer-science articles that explain “Latent Dirichlet Allocation” mathematically. My goal in…
So here we are in 2012, the Year of Code, and we should all be learning to code! Shouldn’t we? Especially if we belong to this community known as Digital Humanities, a field that is endlessly wrestling with its self-definition. Who’s in, who’s out? Is it really necessary to code? Don’t we have to know…
The Department of Education released a draft report about big data and education today. It’s called “Enhancing Teaching and Learning through Educational Data Mining and Learning Analytics,” a title that’s unlikely to win any converts to the notion of a data-curious* view of learning. Part of what’s going to get stuck in the craw is that phrase…
Easter 1982 – thirty years ago! – was spent feeding my latest addiction. Like over a million others, I had acquired the Sinclair ZX 81, which popularised home computing in Britain. It had just one kilobyte of on-board memory; I soon invested in the upgrade to take it up to 16 kilobytes. You used your…
We asked the captain what course of action he proposed to take toward a beast so large, so terrifying, and unpredictable. He hesitated to answer, and then said judiciously: “I think I shall praise it.” – Robert Hass, Praise. I find the Debates in the Digital Humanities volume terribly upsetting. Before I go any further…
The rapidly growing archive of early modern texts online presents significant new opportunities and necessities for the ways in which we organize it. Addressing such challenges raises important questions for both skeptics and boosters: Are new methods of organization resulting in virtual but less reliable finding aids? Do pressures of modernization encourage resource-strapped organizers of…
We’re pleased to present the inaugural issue of the Journal of Digital Humanities, which represents the best of the work that was posted online by the community of digital humanities scholars and practitioners in the final three months of 2011. We wish to underline this notion of community. Indeed, this new journal is predicated on…
In October 2011 I began a project to make all of my 26 articles published in refereed journals available via UCL’s Open Access Repository, “Discovery”. I decided that as well as putting them in the institutional repository, I would write a blog post about each research project, and tweet the papers for download. Would this affect…
I’m having an interesting discussion with Lisa Rhody about the significance of topic modeling at different scales that I’d like to follow up with some examples. I’ve been doing topic modeling on collections of eighteenth- and nineteenth-century volumes, using volumes themselves as the “documents” being modeled. Lisa has been pursuing topic modeling on a collection of poems, using individual…
One of the striking features of computation is the extent to which forms of pattern matching are required in computer processing. Pattern recognition can be described as a means of identifying repeated shapes or structures which are features of a system under investigation. Whilst we tend to think of patterns as visual, of course they…