Presented By: Department of Mathematics
Marjorie Lee Browne Scholars Mini-Symposium
Ian Augsburger, Ammar Eltigani, and Nicholas Simafranca
You are invited to attend a mini-symposium featuring the 14th cohort of MLB scholars completing the program this month. Each MLB scholar will give a 25-minute presentation of their research and have 5 minutes to answer questions from the audience. The MLB mini-symposium will take place on Wednesday, April 22, from 1:00-2:30pm in 1068 EH. Food will be provided during the talks.
Please RSVP by Sunday, April 19 so that we know how many people to expect and can order food accordingly. https://forms.gle/nY719sLt3CdtovG38
---
Ian Augsburger: Efficient Learning of Dirichlet Simplex Models with Asymmetric Concentration Parameters
Abstract: Learning latent topic or mixture models governed by Dirichlet distributions is a central problem in unsupervised learning, with applications ranging from topic modeling to population genetics and biological mixture analysis. Existing approaches—most notably MCMC-based and variational methods—are often computationally expensive, sensitive to initialization, and particularly brittle in regimes where the Dirichlet concentration parameters are asymmetric or highly skewed.
In this work, we study the problem of efficiently learning Dirichlet Simplex Models, with special emphasis on the practically important but underexplored setting of asymmetric concentration parameters and regimes where individual components dominate the mixture. We show that the second-order moment structure of the observed data encodes the simplex geometry up to an orthogonal transformation on a low-dimensional subspace. Exploiting this structure, we reduce parameter recovery to the problem of learning an orthogonal map.
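To give a concrete feel for this reduction (an illustrative aside, not the speaker's algorithm or code), the short NumPy sketch below assumes noiseless data of the form x = Bw, with hypothetical vertices B and Dirichlet weights w: the sample covariance is then supported on the (K-1)-dimensional span of the centered vertices, so its top eigenvectors recover that subspace, and the vertices are pinned down only up to an orthogonal map within it.

```python
# Illustrative sketch only: recover the simplex subspace from second moments.
import numpy as np

rng = np.random.default_rng(0)
d, K, n = 50, 4, 20_000                    # ambient dimension, vertices, samples
B = rng.normal(size=(d, K))                # hypothetical simplex vertices (columns)
alpha = np.array([5.0, 1.0, 0.5, 0.2])     # asymmetric concentration parameters
W = rng.dirichlet(alpha, size=n)           # n x K Dirichlet weight vectors
X = W @ B.T                                # each row is a convex combination of the vertices

Xc = X - X.mean(axis=0)                    # center the data
cov = Xc.T @ Xc / n                        # sample covariance (d x d), rank K-1 here
evals, evecs = np.linalg.eigh(cov)
U = evecs[:, -(K - 1):]                    # top K-1 eigenvectors span the simplex subspace

# Sanity check: the centered vertices lie (numerically) in span(U).
Bc = B - B.mean(axis=1, keepdims=True)
residual = Bc - U @ (U.T @ Bc)
print("relative subspace residual:", np.linalg.norm(residual) / np.linalg.norm(Bc))
```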
By introducing a geometry-aware metric aligned with the intrinsic covariance of the data, we obtain a simplified optimization scheme over the orthogonal group that is both stable and fast. Our approach leverages the polytope geometry of the simplex to enable parallelization over symmetry classes, significantly accelerating convergence. Empirically, the resulting algorithm performs remarkably well for asymmetric Dirichlet models, where standard MCMC-based methods often struggle. We view this framework as a step toward efficient, geometry-driven learning algorithms for broader classes of latent variable models.
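One generic ingredient of optimization over the orthogonal group, sketched below purely for illustration (the geometry-aware metric and symmetry-class parallelization from the abstract are not reproduced here), is the retraction step: after a gradient update, the iterate is snapped back to the nearest orthogonal matrix, namely the polar factor from an SVD. The toy Procrustes-style objective and all variable names are assumptions.

```python
# Generic sketch: retracted gradient descent over the orthogonal group O(r).
import numpy as np

def nearest_orthogonal(M):
    """Nearest orthogonal matrix to M in Frobenius norm (polar factor of the SVD)."""
    U, _, Vt = np.linalg.svd(M)
    return U @ Vt

rng = np.random.default_rng(1)
r = 5
Q_true = nearest_orthogonal(rng.normal(size=(r, r)))   # hypothetical target orthogonal map
A = rng.normal(size=(r, 100))                          # hypothetical source features
B = Q_true @ A                                         # observations: rotated features

Q, step = np.eye(r), 1e-3
for _ in range(500):
    grad = (Q @ A - B) @ A.T                           # gradient of 0.5 * ||QA - B||_F^2
    Q = nearest_orthogonal(Q - step * grad)            # retract back onto O(r)

print("recovery error:", np.linalg.norm(Q - Q_true))
```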
---
Ammar Eltigani: Random Matrix Theory for High-Dimensional Machine Learning
Abstract: The modern high-dimensional regime—where the number of samples and the dimension of the data grow proportionally—is ubiquitous in machine learning applications such as finance, healthcare, wireless communications, neuroscience, and computer vision. Despite the remarkable success of large-scale models in this setting, the underlying mathematical reasons remain only partially understood. Why, for instance, do overparameterized neural networks generalize well, and why do their risk curves exhibit double descent?
This expository talk begins by examining how low-dimensional intuitions fail in high dimensions, focusing on covariance estimation. We then introduce the Marčenko–Pastur law, which describes the limiting spectral distribution of sample covariance matrices for white noise. Finally, we discuss its generalization to arbitrary covariances and apply it to ridgeless linear regression, deriving a theoretical risk curve that displays the double-descent phenomenon.
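As an illustrative aside (not material from the talk), the Marčenko–Pastur prediction is easy to check numerically: for white Gaussian noise, the sample-covariance eigenvalues spread over the interval [(1-√γ)², (1+√γ)²] with γ = p/n, even though every population eigenvalue equals 1.

```python
# Numerical check of the Marchenko-Pastur support for white noise.
import numpy as np

rng = np.random.default_rng(0)
n, p = 4000, 1000                        # samples and dimension, so gamma = 0.25
gamma = p / n
X = rng.normal(size=(n, p))              # white noise: true covariance is the identity
evals = np.linalg.eigvalsh(X.T @ X / n)  # eigenvalues of the sample covariance

edge_lo, edge_hi = (1 - np.sqrt(gamma))**2, (1 + np.sqrt(gamma))**2
print(f"empirical eigenvalue range: [{evals.min():.3f}, {evals.max():.3f}]")
print(f"Marchenko-Pastur support:   [{edge_lo:.3f}, {edge_hi:.3f}]")
```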
---
Nicholas Simafranca: Learning Low-Dimensional Representations with Heteroscedastic Data Sources
Abstract: Principal component analysis (PCA) is a fundamental method for dimensionality reduction, but it treats all samples uniformly and can perform poorly when data come from sources with unequal noise levels. In this talk, I begin with the classical and probabilistic viewpoints of PCA, introducing probabilistic PCA (PPCA) as a latent-variable model for low-dimensional structure. I then discuss HePPCAT, a heteroscedastic extension of PPCA that allows different groups of samples to have different noise variances. Unlike classical PCA, the resulting maximum-likelihood problem is nonconvex and is not solved by a single eigendecomposition. I will describe how this problem can be approached through alternating majorization-minimization and explain how a Riemannian block MM framework gives a route to proving convergence of a proximalized HePPCAT algorithm to a stationary point.
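For contrast with the heteroscedastic setting, the homoscedastic PPCA maximum-likelihood estimate has a well-known closed form obtained from a single eigendecomposition of the sample covariance (Tipping and Bishop). The minimal sketch below (illustrative only; the simulated data and variable names are assumptions) shows that computation; allowing per-group noise variances, as HePPCAT does, removes this closed form and motivates the majorization-minimization approach described above.

```python
# Closed-form maximum-likelihood PPCA under a single shared noise variance.
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 30, 3, 5000
F = rng.normal(size=(d, k))                      # hypothetical true factor matrix
Z = rng.normal(size=(n, k))                      # latent low-dimensional coordinates
X = Z @ F.T + 0.5 * rng.normal(size=(n, d))      # shared noise standard deviation 0.5

S = np.cov(X, rowvar=False)                      # sample covariance (d x d)
evals, evecs = np.linalg.eigh(S)
evals, evecs = evals[::-1], evecs[:, ::-1]       # sort eigenpairs in descending order

sigma2_ml = evals[k:].mean()                     # MLE of the noise variance (about 0.25 here)
W_ml = evecs[:, :k] @ np.diag(np.sqrt(evals[:k] - sigma2_ml))  # MLE factors, up to rotation

print("estimated noise variance:", sigma2_ml)
```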