From the post: I’ve recently published version 0.3.0 of my USAboundaries R package to CRAN. USAboundaries provides access to spatial data for U.S. counties, states, cities, congressional districts, and ZIP codes. Of course you can easily get contemporary boundaries from lots of places, but this package lets you specify dates and get the locations for…


From the resource: This is the last in a series of posts which constitute a “lit review” of sorts, documenting the range of methods scholars are using to compute the distribution of topics over time. The strategies I am considering are: average of topic weights per year (First Post); smoothing or regression analysis (Second Post)…
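
As a rough illustration of the first strategy named in this excerpt, averaging topic weights per year, here is a minimal Python sketch. It is not taken from the linked post: the function name `mean_topic_weights_by_year`, the `(year, weights)` input shape, and the toy numbers are all assumptions made for the example.

```python
from collections import defaultdict


def mean_topic_weights_by_year(docs):
    """Average per-document topic weights within each year.

    `docs` is an iterable of (year, weights) pairs, where `weights` is a
    list holding one document's proportion for each topic. This input
    shape is an assumption for illustration, not a format from the post.
    Returns {year: [mean weight for topic 0, topic 1, ...]}.
    """
    sums = {}                  # year -> running per-topic totals
    counts = defaultdict(int)  # year -> number of documents seen
    for year, weights in docs:
        if year not in sums:
            sums[year] = [0.0] * len(weights)
        for i, w in enumerate(weights):
            sums[year][i] += w
        counts[year] += 1
    # Divide each running total by that year's document count.
    return {
        year: [total / counts[year] for total in totals]
        for year, totals in sorted(sums.items())
    }


# Hypothetical toy corpus: three documents, two topics.
docs = [
    (1900, [0.75, 0.25]),
    (1900, [0.25, 0.75]),
    (1901, [1.0, 0.0]),
]
by_year = mean_topic_weights_by_year(docs)
print(by_year)
```

Plotting each topic's yearly mean against the year gives the familiar "topic prevalence over time" line. Note that years with few documents produce noisy averages, which is one motivation for smoothing.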


From the resource: This is the third in a series of posts which constitute a “lit review” of sorts, documenting the range of methods scholars are using to compute the distribution of topics over time. Graphs of topic prevalence over time are some of the most ubiquitous in digital humanities discussions of topic modeling. They…


In light of word embeddings’ recent popularity, I’ve been playing around with a version called Latent Semantic Analysis (LSA). Admittedly, LSA has fallen out of favor with the rise of neural embeddings like Word2Vec, but LSA has several virtues, including decades of study by linguists and computer scientists. (For an introduction to LSA…
