Editors’ Choice: Building Topic Models through Selective Document Exclusion
Earlier this month, I attended the ASIS&T 2011 Annual Meeting where, much to my delight, the paper I co-authored with Miles Efron and Katrina Fenlon was selected for the Best Paper Award.
In “Building Topic Models in a Federated Digital Library through Selective Document Exclusion,” we presented a way to improve the coherence of algorithmically derived topic models.
The work stems from topic modeling we were doing, first with PLSA and later with LDA, in our IMLS DCC research group. The system we are working with brings together cultural heritage content from over a thousand institutions and, as a result, contains quite diverse and often problematic metadata.