Presented By: Department of Statistics
Statistics Department Seminar Series: Arian Maleki, Department of Statistics, Columbia University
"Accurate and efficient data-point removal for high-dimensional settings"
Abstract: Consider a model trained with p parameters from n independent and identically distributed observations. To assess a data point's impact on the model, we remove it from the dataset and aim to understand the model's behavior when trained on the remaining data. This scenario is relevant in various classical and modern applications, including risk estimation, outlier detection, machine unlearning, and data valuation. Conventional approaches involve training the model on the remaining data, but these can be computationally demanding. Consequently, researchers often resort to approximate methods. This talk highlights that in high-dimensional settings, where p is either larger than n or at the same order, many approximation methods may prove ineffective. We will present and analyze an accurate approximation method tailored for high-dimensional regimes, elucidating the conditions for its accuracy. In the concluding part of the presentation, time permitting, we will briefly discuss some of the unresolved issues in this domain.