
Presented By: Department of Statistics Dissertation Defenses

Contributions to Distributed Learning and Selective Inference

Yumeng Wang

The rapid growth and distributed nature of modern datasets pose significant challenges in statistical learning and inference. Privacy concerns often prohibit direct data sharing across sites, while distributional heterogeneity complicates accurate modeling and inference. Moreover, high-dimensional data and model selection procedures necessitate statistical methods for valid inference post-selection. This dissertation addresses these challenges by developing methodologies in distributed statistical learning, inference with heterogeneous data, and selective inference.

In the first part of the dissertation, we propose a novel one-shot distributed learning algorithm via refitting bootstrap samples. We demonstrate that the proposed estimator achieves full-sample statistical rates with only one round of communication of subsample-based statistics in generalized linear models and noisy phase retrieval. We further extend this approach to an iterative algorithm and apply it to convolutional neural networks (CNNs), where it exhibits superior performance over existing methods in simulation studies.
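The abstract does not spell out the algorithm, so the following is a minimal sketch of one plausible reading for logistic regression: each site communicates only its local coefficient vector, and the center refits a single model on bootstrap samples simulated from the local fits. The function names, the assumed covariate distribution, and the use of scikit-learn are illustrative assumptions, not the dissertation's exact procedure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_local(X, y):
    # Each site fits an (effectively) unpenalized logistic GLM and
    # communicates only its coefficient vector; large C disables
    # regularization across scikit-learn versions.
    return LogisticRegression(C=1e8, fit_intercept=False).fit(X, y).coef_.ravel()

def reboot_refit(local_coefs, n_boot, d):
    # Center: simulate bootstrap samples from each local fitted model,
    # pool them, and refit once on the pooled synthetic data.
    Xs, ys = [], []
    for beta in local_coefs:
        Xb = rng.standard_normal((n_boot, d))      # assumed covariate law (illustrative)
        yb = rng.binomial(1, sigmoid(Xb @ beta))   # responses drawn from the local fit
        Xs.append(Xb)
        ys.append(yb)
    X_pool, y_pool = np.vstack(Xs), np.concatenate(ys)
    return LogisticRegression(C=1e8, fit_intercept=False).fit(X_pool, y_pool).coef_.ravel()

# Toy run: K sites, each holding n observations from a common logistic model.
K, n, d = 10, 500, 5
beta_true = rng.standard_normal(d)
local_fits = []
for _ in range(K):
    X = rng.standard_normal((n, d))
    y = rng.binomial(1, sigmoid(X @ beta_true))
    local_fits.append(fit_local(X, y))
beta_reboot = reboot_refit(local_fits, n_boot=n, d=d)
```

Note the one-shot structure: only the K local coefficient vectors cross site boundaries, and the refit on pooled bootstrap samples is what distinguishes this scheme from simple averaging of local estimates.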

In the second part of the dissertation, we develop a novel one-shot distributed learning algorithm to address cross-site heterogeneity. The proposed method effectively accommodates heterogeneity by allowing nuisance parameters to vary across sites. We show that the proposed estimator attains the full-sample statistical error rate and efficiency with only a single round of communication of local estimators. Our simulation studies support these theoretical findings.
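Again the abstract gives only the high-level structure. One standard way to realize a single round of communication under cross-site heterogeneity, sketched below, is to let each site estimate the shared parameter alongside its own nuisance, and have the center combine the shared-parameter estimates by inverse-variance weighting; the weighting rule and function names here are common illustrative choices, not necessarily the dissertation's estimator.

```python
import numpy as np

def aggregate_shared(theta_hats, cov_hats):
    # Inverse-variance (Fisher-type) weighting of local estimates of the
    # shared parameter; site-specific nuisance estimates never leave the site.
    #   theta_hats : list of (p,) local estimates of the shared parameter
    #   cov_hats   : list of (p, p) local covariance estimates
    precisions = [np.linalg.inv(S) for S in cov_hats]
    combined_cov = np.linalg.inv(sum(precisions))
    theta = combined_cov @ sum(P @ t for P, t in zip(precisions, theta_hats))
    return theta, combined_cov

# Toy usage: three sites reporting a 2-dimensional shared parameter.
thetas = [np.array([1.0, 2.1]), np.array([0.9, 1.9]), np.array([1.1, 2.0])]
covs = [0.01 * np.eye(2)] * 3
theta_pooled, cov_pooled = aggregate_shared(thetas, covs)
```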

In the third part of the dissertation, we introduce an asymptotic pivot for inference on the effects of selected variables on conditional quantile functions. Utilizing estimators from smoothed quantile regression, our proposed pivot is easy to compute and yields asymptotically exact selective inference without strict distributional assumptions on the response variable. By employing external randomization, our approach uses the full data for both selection and inference, outperforming traditional methods such as data splitting by consistently delivering shorter and more reliable confidence intervals. Simulation studies and an empirical application analyzing risk factors for low birth weight validate the practical efficacy of our method.
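The pivot itself conditions on the selection event and involves an external randomization term the abstract does not specify, so it is not reproduced here. The sketch below shows only the underlying building block, convolution-smoothed quantile regression with a Gaussian kernel fit by gradient descent; the function name, step size, and bandwidth are illustrative choices.

```python
import numpy as np
from scipy.stats import norm

def smoothed_qr(X, y, tau, h, lr=0.1, n_iter=5000):
    # Gradient descent on the Gaussian-kernel-smoothed check loss.
    # Smoothing replaces the indicator in the quantile score with a
    # Gaussian CDF, giving the gradient
    #   (1/n) * X' ( Phi(-(y - X beta)/h) - tau ),
    # which is differentiable in beta, unlike the raw check loss.
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        r = y - X @ beta
        grad = X.T @ (norm.cdf(-r / h) - tau) / n
        beta -= lr * grad
    return beta

# Toy usage: median regression (tau = 0.5) on standardized covariates.
rng = np.random.default_rng(1)
X = rng.standard_normal((1000, 3))
y = X @ np.array([1.0, -0.5, 0.25]) + rng.standard_normal(1000)
beta_hat = smoothed_qr(X, y, tau=0.5, h=0.5)
```

Smoothness of this loss is what makes the estimator computationally convenient and amenable to the asymptotic analysis behind the pivot.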
