Recently, I had the opportunity to help J. Warren York, a graduate student in the Department of Politics here at UVa. He’s looking at how tax law affects political contributions and advocacy, so this was an interesting project that may tell us something useful about how the US government works [insert your favorite broken-government joke here].
To do this, he needed to download data from a number of different sources in different formats (JSON, YAML, and CSV), pull it all apart, and put some of it back together in a couple of new data files. One of those sources is the Database on Ideology, Money in Politics, and Elections (DIME). Its data records how much people and organizations have contributed to various candidates, PACs, and other groups.
And while I’ve seen worse, it wasn’t the cleanest data file out there. (To get an idea of what the data looks like, you can see a sample of 100 rows from this data file in this Google Sheet.)
Read More: Validating Data with Types