There are some amazing algorithms coming out the computer science community which promise to revolutionize how journalists deal with large quantities of information. But building a tool that journalists can use to get stories done takes a lot more than algorithms. Closing this gap has been one of the most challenging and rewarding aspects of building Overview, and I really think we’ve learned something.
I want to get into the process of going form algorithm to application here, because — somewhat to my surprise — I don’t think this process is widely understood. The computer science research community is going full speed ahead developing exciting new algorithms, but it seems a bit disconnected from what it takes to get their work used. This is doubly disappointing, because understanding the needs of users often shows that you need a different algorithm.
The development of Overview is a story about text analysis algorithms applied to journalism, but the principles might apply to any sort of data analysis system. One definition says data science is the intersection of computer science, statistics, and subject matter expertise. This post is about connecting computer science with subject matter expertise.