Using topic modeling to automatically generate subject headings from huge digital collections sounds like a great idea...but does it really work in practice? Find out how topic modeling is being investigated for use in digital libraries.
Topic modeling is a new technique from computer science that “learns” the relationships among items in a collection and then assigns topic (subject) headings accordingly. But how well does this work in practice?
We explored the usability, utility, and value of topic modeling by conducting user studies on collections of digital books (HathiTrust) and images (History of Art, etc.) from UM and Yale. We first focused on measuring the coherence, meaning, and interpretability of learned topics, and from this work developed a new method to score topic coherence. After building topics into user interfaces for books and images and developing test scenarios, we launched a large-scale study (more than 700 users) to assess the value of these topics to end users. The study was conducted using unmoderated user testing software, so no one other than the user needed to be present.
We will present preliminary results from that work and summarize the potential opportunities, along with some known limitations, that topic modeling presents for digital libraries.
Zingerman's refreshments provided!