Report: Introducing myself to MALLET

Cross-posted at my Emerging Tech in Libraries blog


In my text mining class at GSLIS, we had a lot of ground to
cover. It was easy enough to jump into Oracle SQL Developer and
Data Miner and plug into the Oracle database that had been set
up for us, and we moved on to processing and classifying
merrily. But now, a year later, I’m totally removed from
that infrastructure. I wanted to review my work from that class
before heading to EMDA next (!) week, but reacquainting
myself with Data Miner would require setting up the whole
environment first. Not fully understanding the Oracle
ecosystem, I thought it would be easy enough to set up a
VirtualBox VM and configure Linux as needed, but after
several failed attempts I gave up and decided to try something new. As
it turns out, MALLET does topic modeling as well as
classification, and topic modeling is something I’d never done before.