As I wrote here a couple of weeks ago, I’m playing around with a variety of clustering techniques to identify patterns in legal records from the early modern Spanish Empire. In this post, I will discuss the first of my training experiments using Normalized Compression Distance (NCD). I’ll look at what NCD is, some potential problems with the method, and then the results from using NCD to analyze the Criminales Series descriptions of the Archivo Nacional del Ecuador’s (ANE) Series Guide. For what it’s worth, this is a very easy and approachable method for measuring similarity between documents and requires almost no programming chops.
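For readers who want to see how little code is involved, here is a minimal sketch of NCD in Python. It uses the standard formula NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size in bytes. The choice of `zlib` as the compressor, the function names, and the sample strings are all assumptions made for illustration. They are not taken from the original experiment or from the ANE descriptions.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Return the length in bytes of the zlib-compressed data."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized Compression Distance between two texts.

    Values near 0 suggest the texts share a lot of structure;
    values near 1 suggest they share very little.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    # Invented placeholder texts, not real archival descriptions.
    doc_a = "criminal case against a shopkeeper for theft of cloth " * 20
    doc_b = "civil suit over the boundaries of a rural estate " * 20
    print("a vs a:", round(ncd(doc_a, doc_a), 3))
    print("a vs b:", round(ncd(doc_a, doc_b), 3))
```

One caveat worth keeping in mind: on very short texts the compressor's fixed overhead dominates, so the distances become noisy. That is one of the potential problems with the method.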

Read Full Post Here

Historians often hope that digitized texts will enable better, faster comparisons of groups of texts. Now that at least the 1grams on Bookworm are running pretty smoothly, I want to start to lay the groundwork for using corpus comparisons to look at words in a big digital library. For the algorithmically minded: this post should act as a somewhat idiosyncratic approach to Dunning’s Log-likelihood statistic. For the hermeneutically minded: this post should explain why you might need _any_ log-likelihood statistic.
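As a rough illustration of the statistic itself, the sketch below computes Dunning's log-likelihood (G²) for one word across two corpora from a 2×2 contingency table: the word's count versus all other tokens, in corpus A versus corpus B. The function name, parameter names, and example counts are hypothetical; nothing here comes from Bookworm or from the full post.

```python
import math


def log_likelihood(count_a: int, total_a: int, count_b: int, total_b: int) -> float:
    """Dunning's G^2 for one word in two corpora.

    count_a, count_b: occurrences of the word in corpus A and B.
    total_a, total_b: total token counts of corpus A and B.
    """
    # Observed 2x2 table: rows are corpora, columns are (word, other tokens).
    observed = [
        [count_a, total_a - count_a],
        [count_b, total_b - count_b],
    ]
    grand_total = total_a + total_b
    row_totals = [total_a, total_b]
    col_totals = [count_a + count_b, grand_total - count_a - count_b]

    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                e = row_totals[i] * col_totals[j] / grand_total
                g2 += o * math.log(o / e)
    return 2.0 * g2


if __name__ == "__main__":
    # Made-up counts: a word seen 50 times in 1,000 tokens vs 10 times in 1,000.
    print(log_likelihood(50, 1000, 10, 1000))
```

The score is zero when the word occurs at the same rate in both corpora and grows as the rates diverge. It does not say which corpus favors the word, so in practice you would also compare the two relative frequencies.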

Read Full Post Here

Paradox Number One: Social media foments revolution, but a sudden removal of social media can increase mobilization and create even more unrest.

We can all stand witness to the ways in which social and news media can spread a movement within and across nations. I know an Egyptian who claimed that her family and friends knew that the revolution was going to occur in the weeks and days before it actually happened. How? Just by the messages on social media and between individuals. In a similar fashion, social media proposed and fanned the flames of the Occupy Wall Street movement in the weeks before it emerged, grew, and took hold as a real story in mainstream media outlets.

Paper delivered on 29 September 2011 in Special Collections of the University of Cardiff Library as the “Inaugural Annual Cardiff Rare Books and Music Lecture.”

The challenge to our engagement in and with the humanities today is the digital medium. This engagement moves into fresh light and focus in consequence of the medium, since, through the new mediality, ‘what we have always done’ is no longer a matter of course, hence remaining unreflected in itself, but demands instead reflection and questioning.

Read Full Piece Here

If there are two things that academia doesn’t need, they are another book about Darwin and another blog post about defining the digital humanities. But it’s always right around this time of year that I find myself preparing for my digital history course and being pulled down the contemplative rabbit hole about how to describe the nature of the digital humanities to a new and varied audience. But rather than create my own definition, I wanted one cobbled together from everyone else.

All of a sudden, I’m starting to pick up signs of a digital humanities backlash. That’s a shame because there’s a big difference between digital humanities and online education since faculty can seemingly control the first thing, but not necessarily the second. The digital humanities help us do what we already do better. Online education…well, since I don’t feel like linking to my entire archive for the last three months, let’s just say I’m not convinced it helps us do anything.

Read Full Post Here

Neither can my father, although both are proficient readers. My sister and her family have multiple televisions, cable, a gaming system and most recently, they have acquired cell phones (the un-“smart” sort), but they do not own a computer. This is not their choice. They are regular, hardworking people, laboring in the service sector in long-held, stable but low-paying jobs. They worry about paying for a serviceable car, not the web. Typical of many working-class people, they are much less connected to the world through the internet than are their wealthier and more educated peers.

Read Full Post Here