
Presented By: Department of Statistics

Statistics Department Seminar Series: Noah Simon, Associate Professor, Department of Biostatistics, University of Washington

"Stochastic optimization for online non-parametric estimation"

Abstract: Non-parametric modeling of regression functions is an important contemporary topic. Many machine learning (ML) methods can be viewed through the lens of non-parametric estimation. Often we aim to learn a predictive model or decision rule and must select model complexity to balance bias and variance. The optimal model complexity depends on the number of available samples as well as on the complexity of the underlying truth. In ML, model complexity is generally selected based on empirical performance rather than theoretical results; nevertheless, theory can help us understand the limits of model performance and motivate methodological ideas.

When building ML models with large datasets, mini-batch stochastic optimization methods are generally used: a model is iteratively updated based on small random subsamples of the full dataset. Theoretically, this is often framed by imagining a dataset that is continually updated with batches of new samples. This is the so-called “online setting,” and it may also be an appropriate theoretical framework when the performance of our method is compute-constrained rather than data-constrained.
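The mini-batch scheme described above can be sketched with a toy least-squares example (the data-generating model, batch size, and step size here are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: linear model y = X @ beta_true + noise
n, p = 10_000, 5
X = rng.normal(size=(n, p))
beta_true = np.arange(1.0, p + 1.0)
y = X @ beta_true + rng.normal(scale=0.1, size=n)

beta = np.zeros(p)          # current estimate
batch_size, lr = 32, 0.01   # mini-batch size and step size

for step in range(2_000):
    idx = rng.integers(0, n, size=batch_size)    # random small subsample
    Xb, yb = X[idx], y[idx]
    grad = Xb.T @ (Xb @ beta - yb) / batch_size  # gradient of squared loss on the batch
    beta -= lr * grad
```

Each update touches only 32 of the 10,000 rows; replacing the subsampling line with freshly drawn data at each step turns this into the online setting the abstract describes.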

As noted, the optimal complexity of our model (for balancing bias and variance) depends on the sample size. Thus, in the online scenario, it is natural to consider a model whose complexity grows as more observations are obtained. This raises further questions: How do we appropriately increase complexity? Do we need to store and re-use old observations as complexity increases? How computationally efficient can we be without losing predictive performance?

In this talk, I consider these questions in a simplified model. I consider estimating a regression function that is smooth (lives in a Sobolev ellipsoid). I propose an almost painfully simple stochastic gradient descent method (in a linear space of growing dimension) and show that (1) this estimator achieves the minimax L2 estimation error rate; and (2) under some assumptions, this estimator uses the minimal memory of any estimator that achieves that minimax rate. I relate this to more classical sieve/projection estimators, and to other stochastic gradient-based methods and ideas. These results also hold for estimating regression functions in a reproducing kernel Hilbert space (RKHS).
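One plausible instantiation of "SGD in a linear space of growing dimension" can be sketched as follows. Everything specific here is an illustrative assumption, not the talk's construction: a cosine basis on [0, 1], smoothness s = 2, a dimension schedule J ≈ t^(1/(2s+1)) matching the usual Sobolev rate, and a polynomially decaying step size. Each new observation triggers one single-sample gradient step, and the coefficient vector is zero-padded as the dimension grows, so no old observations are stored.

```python
import numpy as np

rng = np.random.default_rng(1)

s = 2                                  # assumed smoothness of the Sobolev class
f = lambda x: np.sin(2 * np.pi * x)    # toy true regression function

def basis(x, J):
    """First J orthonormal cosine basis functions on [0, 1]."""
    j = np.arange(J)
    phi = np.sqrt(2.0) * np.cos(np.pi * j * x)
    phi[0] = 1.0
    return phi

coef = np.zeros(1)                     # coefficient vector; grows over time
T = 20_000
for t in range(1, T + 1):
    J = max(1, int(t ** (1.0 / (2 * s + 1))))  # grow dimension with sample size
    if J > coef.size:
        coef = np.pad(coef, (0, J - coef.size))  # new coordinates start at zero
    x = rng.uniform()                  # one fresh online observation
    y = f(x) + 0.1 * rng.normal()
    phi = basis(x, J)
    resid = phi @ coef - y
    coef -= (1.0 / t ** 0.75) * resid * phi    # single-sample SGD step, decaying rate

# Crude check of fit on a grid
grid = np.linspace(0, 1, 200)
fit = np.array([basis(x, coef.size) @ coef for x in grid])
mse = np.mean((fit - f(grid)) ** 2)
```

Memory use is just the current coefficient vector (dimension J, not the sample size t), which is the sense in which such a scheme can be memory-minimal among rate-optimal estimators.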

This seminar will be livestreamed via Zoom. A virtual reception will follow.

Livestream Information

April 2, 2021 (Friday) 10:00am
