For reasons I can’t explain, I have had a very keen interest in crime fiction for many years, especially French crime fiction written since roughly the 1950s. Some of my favorite authors are Léo Malet, Jean-Patrick Manchette, Sébastien Japrisot and Didier Daeninckx. It is no accident, either, that I was drawn to Mitzi Morris’ stylometric murder mystery Poetic Justice. Although I taught a class on the genre some years ago, gave a talk about “money and morality in French crime fiction” and dabbled with some relevant Wikipedia articles, my interest in crime fiction somehow never turned into an active research area of mine. Maybe there was just too much of it: think of Georges Simenon’s 75 Maigret novels, not to mention his more than 100 other works, or of Léo Malet’s 30 Nestor Burma novels and dozens of other crime fiction works, or of Boileau-Narcejac’s 40 novels published under that name. Sure, Balzac was even more productive, but there was only one Balzac! At some point it occurred to me that this was not a problem but an advantage: crime fiction is the perfect playground for computational / quantitative methods of text analysis, simply because there is so much relatively homogeneous material to work with.
During the recent holiday season, I spent a few days putting together a nice little collection of French crime fiction of the twentieth century. It is really more of a testing ground than anything aspiring to complete or representative coverage, but the need to scan, OCR and clean up most of the texts really did not make anything else possible. In any case, the result of this work is a collection of sixty French crime fiction novels published between 1907 and 2010, with ten novels by each of the authors represented: Maurice Leblanc, Gaston Leroux, Georges Simenon, Léo Malet, Jean-Patrick Manchette and Didier Daeninckx. The coverage is uneven and the texts still contain errors, but for seeing what is possible with these texts, it should be fine.