Category: Resources

Resource: Understanding How Beautiful Soup Works


From the resource:

Two years ago, when I first grabbed the transcripts of the TED talks, using wget, I relied upon the wisdom and generosity of Padraic C on StackOverflow to help me use Python’s BeautifulSoup library to get the data out of the downloaded HTML files that I wanted. Now that Katherine Kinnaird and I have decided to add talks published since then, and perhaps even go so far as to re-download the entire corpus so that everything is as much the same as possible, it was time for me to understand how BeautifulSoup (hereafter BS4) works for myself.

Read the full resource here.
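For readers who have not used the library, here is a minimal sketch of the kind of extraction the excerpt describes. It is not the author's code: the HTML string, the "transcript" class name, and the variable names are hypothetical stand-ins, and the markup of the real downloaded TED pages will differ.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for one downloaded HTML file; the structure and the
# "transcript" class name are assumptions made for illustration only.
html = """
<html><head><title>Sample talk</title></head>
<body>
  <div class="transcript">
    <p>First paragraph of the talk.</p>
    <p>Second paragraph of the talk.</p>
  </div>
  <p>Footer text that should be ignored.</p>
</body></html>
"""

# Parse the document with Python's built-in HTML parser.
soup = BeautifulSoup(html, "html.parser")

# The page title is reachable as an attribute of the parsed tree.
title = soup.title.get_text(strip=True)

# Find the container first, then collect only the paragraphs inside it.
container = soup.find("div", class_="transcript")
paragraphs = [p.get_text(strip=True) for p in container.find_all("p")]

print(title)
print(paragraphs)
```

Searching inside the container, rather than calling find_all("p") on the whole document, is what keeps the footer paragraph out of the result. For a file on disk, the same calls should work after reading the file's contents into a string.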

Resource: An Archive of 8,000 Benjamin Franklin Papers Now Digitized & Put Online

From the post:

Let me quickly pass along some good news from the Library of Congress: “The papers of American scientist, statesman and diplomat Benjamin Franklin have been digitized and are now available online for the first time…. The Franklin papers consist of approximately 8,000 items mostly dating from the 1770s and 1780s. These include the petition that the First Continental Congress sent to Franklin, then a colonial diplomat in London, to deliver to King George III; letterbooks Franklin kept as he negotiated the Treaty of Paris that ended the Revolutionary War; drafts of the treaty; notes documenting his scientific observations, and correspondence with fellow scientists.”

Read more here.

Resource: Free UC Berkeley Data Science Course Online

From the post:

It’s worth passing along a message from UC Berkeley. According to its news service, the “fastest-growing course in UC Berkeley’s history — Foundations of Data Science [aka Data 8X] — is being offered free online this spring for the first time through the campus’s online education hub, edX.” More than 1,000 students are now taking the course each semester at the university. Designed for students who have not previously taken statistics or computer science courses, Foundations of Data Science will teach you in a three-course sequence “how to combine data with Python programming skills to ask questions and explore problems that you encounter in any field of study, in a future job, and even in everyday life.”

Read more here.

Resource: Enter “The Magazine Rack,” the Internet Archive’s Collection of 34,000 Digitized Magazines

About the resource:

Before we kept up with culture through the internet, we kept up with culture through magazines. That historical fact may at first strike those of us over 30 as trivial and those half a generation down as irrelevant, but now, thanks to the Internet Archive, we can all easily experience the depth and breadth of the magazine era as something more than an abstraction or an increasingly distant memory. In keeping with their apparent mission to become the predominant archive of pre-internet media, they’ve set up the Magazine Rack, a downloadable collection of over 34,000 digitized magazines and other monthly publications.

Read more here.

Resource: Altair for visualization in Python

From the post:

With Altair, you can spend more time understanding your data and its meaning. Altair’s API is simple, friendly and consistent and built on top of the powerful Vega-Lite visualization grammar. This elegant simplicity produces beautiful and effective visualizations with a minimal amount of code.

Read more here.

Resource: Behold the MusicMap

From the post:

A Pandora for the adventurous antiquarian, the highly underrated site Radiooooo gives users streaming music from all over the world and every decade since 1900. While it offers an aural feast, its limited interface leaves much to be desired from an educational standpoint. On the other end of the audio-visual spectrum, clever diagrams like those we’ve featured here on electronic music, alternative, and hip hop show the detailed connections between all the major acts in these genres, but they all do so in silence. Now a new interactive infographic built by Belgian architect Kwinten Crauwels brings together an encyclopedic infographic with an exhaustive musical archive. Though it’s missing some of the features of the resources above, the Musicmap far surpasses anything of its kind online. It is “both a 23andMe-style ancestral tree and a thorough disambiguation of just about every extant genre of music,” writes Fast Company.

Read more here.

Resource: How to Scrape Reddit with Python

From the resource:

Last month, Storybench editor Aleszu Bajak and I decided to explore user data on nootropics, the brain-boosting pills that have become popular for their productivity-enhancing properties. Many of the substances are also banned at the Olympics, which is why we were able to pitch and publish the piece at Smithsonian magazine during the 2018 Winter Olympics. For the story and visualization, we decided to scrape Reddit to better understand the chatter surrounding drugs like modafinil, noopept and piracetam.

In this Python tutorial, I will walk you through how to access the Reddit API to download data for your own project.

Read more here.

Resource: KITAB – Knowledge Information Technology and the Arabic Books

From the post:

KITAB provides a digital tool-box and a forum for discussions about Arabic texts. We wish to empower users to explore Arabic texts in completely new ways and to expand the frontiers of knowledge about one of the world’s largest and most complex textual traditions. We are leading with a tool that detects how authors copied from previous works. Arabic authors frequently made use of past works, cutting them into pieces and reconstituting them to address their own outlooks and concerns. Now you can discover relationships between these texts and also the profoundly intertextual circulatory systems in which they sit.

Find out more here.