Right now Latent Semantic Analysis is the analytical tool I’m finding most useful. By measuring the strength of association between words or groups of words, LSA allows a literary historian to map themes, discourses, and varieties of diction in a given period. This approach, more than any other I’ve tried, turns up leads that are useful for me as a literary scholar. But when I talk to other people in digital humanities, I rarely hear enthusiasm for it. Why doesn’t LSA get more love? I see three reasons.
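To make "strength of association between words" concrete, here is a minimal sketch of the usual LSA recipe on a toy corpus: build a term-document count matrix, reduce it with a truncated singular value decomposition, and compare terms by cosine similarity in the reduced space. The four tiny documents, the choice of `k = 2`, and the helper name `association` are all assumptions made for illustration. This is not the author's own pipeline, and a real analysis would typically weight the counts (for example with tf-idf) and use a far larger corpus. It assumes NumPy is installed.

```python
import numpy as np

# Toy corpus (an assumption for illustration): two "themes" with no shared words.
docs = [
    "love heart love",
    "heart love sorrow",
    "ship sea sail",
    "sea ship storm",
]

# Build a term-document count matrix: one row per term, one column per document.
vocab = sorted({word for doc in docs for word in doc.split()})
index = {word: i for i, word in enumerate(vocab)}
counts = np.zeros((len(vocab), len(docs)))
for j, doc in enumerate(docs):
    for word in doc.split():
        counts[index[word], j] += 1

# Truncated SVD: keep only the k strongest latent dimensions.
U, S, Vt = np.linalg.svd(counts, full_matrices=False)
k = 2
term_vectors = U[:, :k] * S[:k]  # each row places one term in the latent space


def association(a, b):
    """Cosine similarity between two terms in the reduced space."""
    va, vb = term_vectors[index[a]], term_vectors[index[b]]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))


print("love ~ heart:", round(association("love", "heart"), 3))
print("love ~ ship: ", round(association("love", "ship"), 3))
```

On this toy data, "love" and "heart" should score close to 1 and "love" and "ship" close to 0, because the two groups of documents share no vocabulary. With only two latent dimensions the picture is deliberately coarse; on real text the interesting leads come from associations that are strong but not obvious.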

Read Full Post Here

The point of visualization is usually to reveal as much of the structure of a dataset as possible. But what if the data is sensitive or proprietary, and the person doing the analysis is not supposed to be able to know everything about it? In a paper to be presented next week at InfoVis, my Ph.D. student Aritra Dasgupta and I describe the issues involved in privacy-preserving visualization, and propose a variation of parallel coordinates that controls the amount of information shown to the user.

Read Full Paper Here

Guy Massie and I recently gave a talk at the Carleton University Art Gallery on what we learned this past summer in our attempt to crowdsource local cultural heritage knowledge & memories. With the third member of our happy team, Nadine Feuerherm, we wrote a case study and have submitted it to ‘Writing History in the Digital Age’. This born-digital volume is currently in its open peer-review phase, so we invite your comments on our work there. Below are the slides from our talk. Enjoy!

View Slides Here

The purpose of this ebook is to provide a brief overview of the Ruby programming language and consider ways Ruby (or any other programming language) can be applied to the day-to-day operations of humanities scholars.  Once you complete this book, you should have a good understanding of Ruby basics, be able to complete basic tasks with Ruby, and hopefully leave with a solid basis that will allow you to continue learning.

Read ebook Here

For our third interview, I am thrilled to have a chance to chat with Brett Bobley, the CIO and Director of the Office for Digital Humanities at the National Endowment for the Humanities. I wanted to catch up with him on how some of the work NEH is supporting under the Digging into Data grants might connect with issues around the preservation of, and access to, digital content.

Read Full Post Here

Like cognitive literary studies, digital humanities must draw on other disciplines, using methods and tools that many humanities scholars aren’t comfortable with. And digital humanities has witnessed similar debates about the extent to which we must immerse ourselves in these other disciplines. Do we, as Stephen Ramsay suggests, have to know how to code, and build things? Do we have to be trained statisticians so that our text-mining results are “statistically significant”? Are we more or less rigorous than the proponents of culturomics, whose work many humanities scholars seem skeptical about? These are questions about method, and interdisciplinarity, and collaboration. And they’re not particularly new questions.

Tim Hitchcock, another member of the ‘With Criminal Intent’ team, has described how online technologies can change the way we access archives. Instead of being forced to navigate the hierarchical structures that archives impose on records, which in turn tend to reflect the workings of the institutions that created the records, we can directly find the people whose lives were regulated, influenced, shaped or controlled by the policies of those institutions.

Instead of merely hearing ‘the institutional voice… in all its stentorian splendour’, he says, we can listen in to ‘the quieter tones uttered by the individual’.[8]

As one of the first of a ‘new style’ of museum online collections, launching several internet generations ago in 2006, the Powerhouse Museum’s collection database has been undergoing a rethink in recent times. Five years is a very long time on the web, and not only has the landscape of online museum collections radically changed, but so too has the way researchers, including curators, use these online collections as part of their own research practices.

Digging through five years of data has revealed a number of key patterns in usage which, when combined with user research, paint a very different picture of the value and usefulness of online collections.

A number of our Web Science students are analyzing people’s use of Twitter, and the tools available to them are rather limited now that Twitter has changed its terms of service, reducing the functionality of TwapperKeeper and similar sites. There are personal tools like NodeXL (a plugin for Microsoft Excel running under Windows) that do provide simple data capture from social networks, but a study will require long-term data collection over many months that is independent of reboots and power outages.
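One common way to make a collector survive reboots and power outages is to record its progress on disk after every batch, so that a restarted process picks up where the last one stopped. The sketch below illustrates that idea only. `fetch_since` is a hypothetical stand-in for whatever call actually retrieves posts newer than a given ID (no real Twitter API is used here), and the checkpoint and archive file layouts are likewise assumptions.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the last saved state, or a fresh one if no checkpoint exists yet."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {"last_id": 0}


def save_checkpoint(path, state):
    """Write the state to a temp file, then swap it in with an atomic rename,
    so a power cut cannot leave a half-written checkpoint behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)


def collect(fetch_since, checkpoint_path, archive_path):
    """Fetch one batch newer than the checkpoint, append it to the archive,
    and only then advance the checkpoint. Returns the number of items stored."""
    state = load_checkpoint(checkpoint_path)
    items = fetch_since(state["last_id"])  # hypothetical data source
    if not items:
        return 0
    with open(archive_path, "a", encoding="utf-8") as out:
        for item in items:
            out.write(json.dumps(item) + "\n")
        out.flush()
        os.fsync(out.fileno())
    state["last_id"] = max(item["id"] for item in items)
    save_checkpoint(checkpoint_path, state)
    return len(items)
```

Because the archive is written before the checkpoint advances, a crash between the two steps can store a batch twice but should not lose one; duplicates can be removed later by ID. Running `collect` from a scheduler that restarts with the machine (cron or a system service, for instance) would supply the "independent of reboots" part.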