From the post:
Digital work in and around the Humanities often involves moving data from one system or format to another. That data often involves complex textual materials in multiple languages and writing systems. One commonly used format is the “Comma-Separated Values” text file. It’s not uncommon to find that characters not used in English get garbled when exported from a spreadsheet program like Microsoft Excel to CSV (or imported from CSV into such a program). What’s going on and how do you make it stop?
Source: Preserving Accented and Non-Roman Characters in CSV Workflows
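The excerpt stops before the answer, so what follows is an illustration rather than the linked post's own recommendation. One common cause of this garbling is an encoding mismatch: the file is written as UTF-8 but read as a legacy code page such as Windows-1252. A minimal Python sketch, assuming that cause, reproduces the mismatch and shows one widely used workaround, which is writing the CSV with a UTF-8 byte order mark so that spreadsheet programs like Excel can detect the encoding. The sample rows and the helper names (`write_csv`, `read_csv`, `garble`, `round_trip`) are hypothetical.

```python
import csv
import os
import tempfile

# Hypothetical sample data mixing accented Latin and non-Roman scripts.
ROWS = [
    ["name", "language"],
    ["Dvořák", "Czech"],
    ["Ελληνικά", "Greek"],
    ["日本語", "Japanese"],
]


def write_csv(path, rows):
    # "utf-8-sig" writes UTF-8 with a leading byte order mark (BOM).
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        csv.writer(f).writerows(rows)


def read_csv(path):
    # "utf-8-sig" also strips the BOM on reading, if one is present.
    with open(path, "r", encoding="utf-8-sig", newline="") as f:
        return list(csv.reader(f))


def garble(text):
    # Simulate the mismatch: UTF-8 bytes misread as Windows-1252.
    return text.encode("utf-8").decode("cp1252", errors="replace")


def round_trip(rows):
    # Write to a temporary file, then read it back.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.csv")
        write_csv(path, rows)
        with open(path, "rb") as f:
            starts_with_bom = f.read(3) == b"\xef\xbb\xbf"
        return read_csv(path), starts_with_bom


if __name__ == "__main__":
    rows_back, starts_with_bom = round_trip(ROWS)
    print(rows_back == ROWS, starts_with_bom)
    print(garble("é"))
```

The BOM is only a hint to the reading program. Going the other way, from a spreadsheet to CSV, the equivalent step is to choose an export option that explicitly names UTF-8, where the program offers one.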
would you like some help with that?
I’m not being snarky. Right now, I have several friends writing articles that are largely or partly a critique of interrelated trends that go under the names “data” or “distant reading.” It looks like many other articles of the same kind are being written. This is good news! I believe fervently in Mae West’s theory of publicity. “I don’t care what the newspapers say about me as long as they spell my name right.” (Though it turns out we may not actually know who said that, so I guess the newspapers failed.) In any case, this blog post is not going to try to stop you from proving that numbers are neoliberal, unethical, inevitably assert objectivity, aim to eliminate all close reading from literary study, fail to represent time, and lead to loss of “cultural authority.” Go for it! Ideas live on critique.
But I do want to help you “spell our names right.” Andrew Piper has recently pointed out that critiques of data-driven research tend to use a small sample of articles. He expressed that more strongly, but I happen to like the article he was aiming at, so I’m going to soften his expression. However, I don’t disagree with the underlying point! For some reason, critics of numbers don’t feel they need to consider more than one example, or two if they’re in a generous mood.
Read full post here.
In coordination with the McNeil Center for Early American Studies and specialists at the University of Pennsylvania Libraries, in addition to bepress, we have established the Magazine of Early American Datasets (MEAD).
Read full post here.