Free Lecture / Discussion

Presented By: Department of Statistics Dissertation Defenses

Structured Statistical Learning and Inference for Complex Scientific Data

Name: Structured Statistical Learning and Inference for Complex Scientific Data
Start: 2026-05-26T15:30:00-04:00
End: 2026-05-26T17:00:00-04:00
Location: Virtual

Yang Li

This dissertation develops structured statistical learning and inference methods for complex scientific data. Here, structure refers to problem-specific patterns that can be modeled to improve learning or inference: cluster-specific abundance and presence-absence patterns in microbiome compositions, modular organization in high-dimensional conditional dependence networks, and the conditional predictive structure among outcomes, covariates, and black-box predictions. Modeling such structure can improve clustering, network learning, and inference while preserving interpretability and statistical validity.

The first part studies model-based clustering of microbiome compositional data. We develop an Ising-Dirichlet mixture model for zero-inflated compositions, where each cluster has a presence-absence dependence structure and a nonzero abundance profile. The method is designed to improve clustering with limited samples by using information from both taxon occurrence patterns and relative abundance variation. Simulations and a resistant potato starch study show improved clustering accuracy and interpretable microbiome subgroups.

The second part studies variable clustering in high-dimensional graphical models. We develop a one-step joint estimation framework for a sparse precision matrix and a latent variable partition. This allows graph estimation and partition recovery to reinforce each other, rather than clustering a separately estimated graph. The method treats the partition as an explicit estimation target and allows nonzero cross-cluster dependence, relying on a modularity criterion in which within-cluster connectivity is denser than between-cluster connectivity. Simulations and real-data applications show more stable and interpretable graph-and-cluster representations than two-stage alternatives.
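
As a rough illustration of the modularity idea only (not the joint estimator described above), the sketch below takes a precision matrix and a candidate variable partition and compares the edge density within clusters to the density between clusters. The function name `edge_densities`, the tolerance `tol`, and the six-variable toy graph are all assumptions made for this example.

```python
from itertools import combinations


def edge_densities(precision, labels, tol=1e-8):
    """Return (within, between) edge densities for a partition of the variables.

    An off-diagonal entry with absolute value above `tol` is treated as an
    edge, i.e. a conditional dependence between two variables.
    """
    within_edges = within_pairs = between_edges = between_pairs = 0
    for i, j in combinations(range(len(labels)), 2):
        is_edge = abs(precision[i][j]) > tol
        if labels[i] == labels[j]:
            within_pairs += 1
            within_edges += is_edge
        else:
            between_pairs += 1
            between_edges += is_edge
    within = within_edges / within_pairs if within_pairs else 0.0
    between = between_edges / between_pairs if between_pairs else 0.0
    return within, between


# Hypothetical toy graph: two clusters of three variables and one cross-cluster edge.
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (4, 5), (2, 3)]
toy_labels = [0, 0, 0, 1, 1, 1]

# Diagonally dominant symmetric matrix, so it is a valid precision matrix.
toy_precision = [[2.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
for i, j in toy_edges:
    toy_precision[i][j] = toy_precision[j][i] = 0.4

toy_within, toy_between = edge_densities(toy_precision, toy_labels)
print(f"within-cluster density:  {toy_within:.3f}")
print(f"between-cluster density: {toy_between:.3f}")
```

In this toy case the partition keeps the single cross-cluster edge while still showing denser connectivity inside clusters, which is the kind of pattern the modularity criterion is described as relying on.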

The third part studies statistical inference with limited gold-standard labels and abundant black-box predictions. Because these predictions are not ground truth, valid use requires bias correction. We develop adaptive prediction-powered inference, which learns a score-side adjustment from labeled data to approximate the variance-optimal conditional score adjustment through Taylor-based and ensemble-based constructions. Simulations and real-data examples show that the method preserves coverage while producing smaller confidence regions than existing prediction-powered and surrogate-adjustment methods.
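
To make the bias-correction step concrete, here is a minimal sketch of the standard (non-adaptive) prediction-powered estimator of a mean, which averages the predictions on the unlabeled data and adds the average labeled residual as a correction. It is not the adaptive method from the dissertation; the function names, the simulated data, and the fixed prediction bias of 0.5 are assumptions chosen for illustration.

```python
import math
import random
import statistics


def ppi_mean_ci(y_labeled, pred_labeled, pred_unlabeled, z=1.96):
    """Prediction-powered estimate of a mean with a normal-approximation interval."""
    n, big_n = len(y_labeled), len(pred_unlabeled)
    # Rectifier: average prediction error measured on the labeled sample.
    residuals = [y - f for y, f in zip(y_labeled, pred_labeled)]
    estimate = statistics.fmean(pred_unlabeled) + statistics.fmean(residuals)
    # The two samples are independent, so their variances add.
    var = statistics.variance(pred_unlabeled) / big_n + statistics.variance(residuals) / n
    half = z * math.sqrt(var)
    return estimate, (estimate - half, estimate + half)


def classical_mean_ci(y_labeled, z=1.96):
    """Labeled-data-only estimate of the mean, for comparison."""
    estimate = statistics.fmean(y_labeled)
    half = z * math.sqrt(statistics.variance(y_labeled) / len(y_labeled))
    return estimate, (estimate - half, estimate + half)


# Simulated setting (assumed): true mean 2.0, predictions shifted upward by 0.5.
rng = random.Random(0)


def draw():
    x = rng.gauss(2.0, 1.0)
    y = x + rng.gauss(0.0, 0.3)   # gold-standard outcome
    f = x + 0.5                   # biased black-box prediction
    return y, f


labeled = [draw() for _ in range(200)]
unlabeled_preds = [draw()[1] for _ in range(20000)]
y_lab = [y for y, _ in labeled]
f_lab = [f for _, f in labeled]

naive_estimate = statistics.fmean(unlabeled_preds)  # ignores the bias
ppi_estimate, ppi_interval = ppi_mean_ci(y_lab, f_lab, unlabeled_preds)
classical_estimate, classical_interval = classical_mean_ci(y_lab)

print("naive    :", round(naive_estimate, 3))
print("PPI      :", round(ppi_estimate, 3), ppi_interval)
print("classical:", round(classical_estimate, 3), classical_interval)
```

Because the predictions track the outcome closely, the residuals vary less than the outcomes themselves, so the corrected interval is narrower than the labeled-only interval while the naive average of predictions stays biased.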

Livestream Information

Livestream
May 26, 2026 (Tuesday) 3:30pm

Keywords

Dissertation

Happening @ Michigan

The University of Michigan Events Calendar
