When / Where
All occurrences of this event have passed.
This listing is displayed for historical purposes.

Presented By: Department of Statistics

Oral Prelim: Mikhail Yurochkin, New algorithms for Topic Modeling

Topic modeling is a class of exploratory algorithms applied to text, image, audio, or video data. The goal is to find latent topics that summarize large collections of data and make them easier to understand. We develop two new topic modeling algorithms and design a new mathematical formulation of the problem.

In the first part, we describe a novel geometric view of the problem and develop a fast algorithm based on k-means to capture the geometric structure. We demonstrate the performance of the algorithm and compare it to established techniques on simulated data.
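As a rough illustration of the geometric intuition only, the sketch below runs plain Lloyd's k-means on normalized word-frequency vectors. It is not the algorithm presented in the talk; the toy counts and the helper names (normalize, sq_dist, kmeans) are assumptions made for this example. Because every document is a point on the probability simplex over the vocabulary, each centroid is also a probability vector over words.

```python
import random


def normalize(counts):
    """Turn raw word counts into a point on the probability simplex."""
    total = sum(counts)
    return [c / total for c in counts]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns the k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        for j, members in enumerate(clusters):
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


# Hypothetical toy corpus: 4 documents over a 4-word vocabulary.
toy_counts = [[9, 1, 0, 0], [8, 2, 0, 0], [0, 0, 7, 3], [0, 1, 6, 3]]
docs = [normalize(c) for c in toy_counts]
centroids = kmeans(docs, k=2)
```

Each centroid is an average of points on the simplex, so it can be read as a word distribution for a cluster of documents. How such centroids relate to the actual topics is the subject of the geometric view described above.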

In the second part, we take the common probabilistic formulation of the model and address inference inefficiencies of the currently used algorithms. We design a new inference procedure based on Metropolis-Hastings and suggest a new method of proposing candidates for high-dimensional probability vectors via the Generalized Beta distribution. We also consider the supervised setting, where documents have class labels, and generalize our algorithm to this case. Performance is evaluated with several simulation studies and a political blogs data set in which each document is labeled either liberal or conservative.
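To make the idea of a Metropolis-Hastings move on a probability vector concrete, here is a minimal sketch. It does not use the Generalized Beta proposal from the talk, whose details are not given here; a Dirichlet proposal centered near the current state is swapped in as a standard stand-in. The target (a multinomial likelihood with a flat Dirichlet prior), the counts, and all function names are assumptions chosen so the result can be checked against a known posterior.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    s = sum(g)
    return [x / s for x in g]


def log_dirichlet_pdf(x, alpha):
    """Log density of Dirichlet(alpha) at the probability vector x."""
    return (math.lgamma(sum(alpha))
            - sum(math.lgamma(a) for a in alpha)
            + sum((a - 1.0) * math.log(v) for a, v in zip(alpha, x)))


def log_target(theta, counts, prior=1.0):
    """Unnormalized log posterior: multinomial counts, symmetric Dirichlet prior."""
    return sum((n + prior - 1.0) * math.log(t) for n, t in zip(counts, theta))


def mh_simplex(counts, steps=5000, concentration=50.0, seed=0):
    """Metropolis-Hastings over probability vectors with a Dirichlet proposal."""
    rng = random.Random(seed)
    k = len(counts)
    theta = [1.0 / k] * k
    samples = []
    for _ in range(steps):
        # Propose near the current state; the +1 keeps every parameter >= 1.
        fwd = [concentration * t + 1.0 for t in theta]
        cand = sample_dirichlet(fwd, rng)
        bwd = [concentration * t + 1.0 for t in cand]
        # The proposal is asymmetric, so the Hastings correction is required.
        log_ratio = (log_target(cand, counts) - log_target(theta, counts)
                     + log_dirichlet_pdf(theta, bwd)
                     - log_dirichlet_pdf(cand, fwd))
        if math.log(1.0 - rng.random()) < log_ratio:
            theta = cand
        samples.append(theta)
    return samples


# Hypothetical counts for a 3-word vocabulary.
samples = mh_simplex([30, 15, 5])
```

With a flat prior, the exact posterior for these counts is Dirichlet(31, 16, 6), so the sample averages can be compared against the known posterior mean as a sanity check.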

In the third part, we propose an EM algorithm based on a closed-form posterior approximation using Carlson's multiple hypergeometric functions.
