When / Where
All occurrences of this event have passed.
This listing is displayed for historical purposes.

Presented By: Department of Statistics Dissertation Defenses

Thesis Defense: High-Dimensional Statistical Inference: Phase Transition, Power Enhancement, and Sampling

Yinqiu He

Abstract:

The "Big Data" era features large amounts of high-dimensional data, in which the number of characteristics per subject is large. The high dimensionality of such big data can pose many new challenges for statistical inference, including (I) the failure of classical approximation theory, (II) the loss of statistical power, and (III) the increase of computational cost. This dissertation studies three important problems that arise in this context.

(I) The first part introduces a newly discovered phase transition phenomenon of the widely used likelihood ratio tests. It is broadly recognized that classical large-sample approximation theory, which is valid under finite dimensions, may fail under high dimensions. However, it is usually not well understood when such a transition happens as the data dimension increases, and this can hinder the validation of statistical inference in practice. Focusing on the popular likelihood ratio tests, we derive necessary and sufficient conditions characterizing the phase transition boundaries where Wilks' theorem becomes invalid. Based on this, we further obtain a sharp characterization of the approximation bias of Wilks' theorem.
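
As an illustration of the kind of failure described in part (I), the following minimal sketch simulates the likelihood ratio statistic for one specific test, H0: Sigma = I_p for zero-mean Gaussian data, and compares its average with the mean p(p+1)/2 of the chi-square limit given by Wilks' theorem. The choice of test, the function names (`lrt_identity_stat`, `mean_ratio`), and the sample sizes are assumptions made for this example. The sketch does not reproduce the phase transition boundaries derived in the dissertation.

```python
import math
import random


def lrt_identity_stat(n, p, rng):
    """Draw one value of -2 log(LR) for H0: Sigma = I_p, simulated under H0.

    Assumes n i.i.d. N(0, I_p) observations with known zero mean, so that
    W = X^T X is Wishart(n, I_p). By the Bartlett decomposition W = L L^T,
    the squared diagonal entries of L are independent chi-square variables
    with n, n-1, ..., n-p+1 degrees of freedom, and the below-diagonal
    entries of L are independent standard normals. Requires p <= n.
    """
    # chi-square with k degrees of freedom = Gamma(shape=k/2, scale=2)
    diag_sq = [rng.gammavariate((n - i) / 2.0, 2.0) for i in range(p)]
    # Sum of the m squared below-diagonal normals is chi-square with m d.f.
    m = p * (p - 1) // 2
    off_sq = rng.gammavariate(m / 2.0, 2.0) if m > 0 else 0.0
    trace_w = sum(diag_sq) + off_sq
    logdet_w = sum(math.log(d) for d in diag_sq)
    # -2 log LR = tr(W) - n * log det(W / n) - n * p
    return trace_w - n * (logdet_w - p * math.log(n)) - n * p


def mean_ratio(n, p, reps, seed):
    """Average simulated statistic divided by the Wilks mean p(p+1)/2."""
    rng = random.Random(seed)
    avg = sum(lrt_identity_stat(n, p, rng) for _ in range(reps)) / reps
    return avg / (p * (p + 1) / 2.0)


if __name__ == "__main__":
    n = 100
    for p in (2, 10, 50):
        print(f"n={n}, p={p}: mean / Wilks mean = {mean_ratio(n, p, 2000, 1):.3f}")
```

With n fixed, a ratio close to 1 indicates that the chi-square approximation is adequate for that p. A ratio that drifts above 1 as p grows indicates that the statistic is systematically larger than the chi-square limit predicts, so a test calibrated by Wilks' theorem would reject too often.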

(II) The second part proposes a novel adaptive testing framework that can maintain high statistical power against a variety of alternative hypotheses. Many scientific questions in high-dimensional data analyses can be formulated as testing high-dimensional parameters globally, e.g., testing whether there exists any association between a large number of SNPs and a certain heritable disease in genome-wide association studies. In these problems, many existing methods are designed to capture certain directional information in a high-dimensional space and are thus powerful only for specific alternatives. To enhance the statistical power, we construct an innovative family of test statistics that can capture the information in different directions of a high-dimensional space. For a broad class of problems, we establish high-dimensional asymptotic theory for the constructed statistics and develop testing procedures that are adaptively powerful across a wide range of scenarios.
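
The motivation in part (II), that a single test statistic tends to be powerful only for some alternatives, can be seen in a toy Gaussian mean-testing problem. The sketch below is an assumption-laden illustration, not the family of statistics constructed in the dissertation: it compares a sum-of-squares statistic with a maximum-type statistic for H0: mu = 0 based on one observation z ~ N(mu, I_p), with critical values calibrated by simulation. The names `draw_stats`, `empirical_quantile`, and `power_comparison` and the signal strengths are hypothetical.

```python
import random


def draw_stats(mu, rng):
    """Return (sum of squares, max absolute value) of z ~ N(mu, I_p)."""
    z = [rng.gauss(m, 1.0) for m in mu]
    return sum(v * v for v in z), max(abs(v) for v in z)


def empirical_quantile(values, q):
    """Simple order-statistic estimate of the q-th quantile."""
    s = sorted(values)
    return s[min(len(s) - 1, int(q * len(s)))]


def power_comparison(mu, n_null=1000, n_alt=500, level=0.05, seed=0):
    """Estimate the power of the sum-type and max-type tests at mean mu."""
    rng = random.Random(seed)
    p = len(mu)
    # Calibrate both critical values by simulating under H0: mu = 0.
    null = [draw_stats([0.0] * p, rng) for _ in range(n_null)]
    crit_sum = empirical_quantile([s for s, _ in null], 1 - level)
    crit_max = empirical_quantile([m for _, m in null], 1 - level)
    # Estimate rejection rates under the alternative.
    alt = [draw_stats(mu, rng) for _ in range(n_alt)]
    power_sum = sum(s > crit_sum for s, _ in alt) / n_alt
    power_max = sum(m > crit_max for _, m in alt) / n_alt
    return power_sum, power_max


if __name__ == "__main__":
    p = 100
    sparse = [4.5] + [0.0] * (p - 1)  # one strong signal
    dense = [0.6] * p                 # many weak signals
    print("sparse alternative (sum, max):", power_comparison(sparse))
    print("dense alternative  (sum, max):", power_comparison(dense))
```

In this toy setting the maximum-type test should do better when the signal is concentrated in one coordinate, and the sum-of-squares test should do better when the signal is spread thinly over all coordinates. An adaptive procedure aims to perform well in both regimes without knowing in advance which one holds.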

(III) The third part concerns the computational challenge of quantifying rare-event probabilities in statistical inference. In particular, analyzing high-dimensional data frequently involves a large number of hypotheses and results in stringent significance thresholds. It is therefore often required to accurately estimate an extreme tail probability of each test statistic. However, analytical formulae are usually unavailable for nontrivial statistics, and naive Monte Carlo methods usually require a huge number of simulations and are computationally costly. Driven by rare-event issues arising from testing covariance structures, we develop an asymptotically efficient importance sampling algorithm to compute the extreme tail probabilities of the popular ratio statistic of the largest eigenvalue to the trace of a Wishart matrix.
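
To make the computational issue in part (III) concrete, the sketch below contrasts naive Monte Carlo with importance sampling on a deliberately simple rare event, the tail probability P(Z > t) of a standard normal variable. This is a one-dimensional toy based on a mean-shifted proposal, chosen because the exact answer is available for comparison. It is not the algorithm developed in the dissertation for the largest-eigenvalue-to-trace ratio of a Wishart matrix, and the function names are hypothetical.

```python
import math
import random


def exact_tail(t):
    """Exact P(Z > t) for Z ~ N(0, 1), via the complementary error function."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def naive_mc(t, n, rng):
    """Naive Monte Carlo: fraction of N(0, 1) draws that exceed t."""
    hits = sum(1 for _ in range(n) if rng.gauss(0.0, 1.0) > t)
    return hits / n


def importance_sampling(t, n, rng):
    """Importance sampling with proposal N(t, 1), i.e. a mean shift to t.

    Each draw x that lands in the event {x > t} is weighted by the density
    ratio phi(x) / phi(x - t) = exp(-t * x + t^2 / 2).
    """
    total = 0.0
    for _ in range(n):
        x = rng.gauss(t, 1.0)
        if x > t:
            total += math.exp(-t * x + 0.5 * t * t)
    return total / n


if __name__ == "__main__":
    t, n = 5.0, 20000
    rng = random.Random(0)
    print("exact              :", exact_tail(t))
    print("naive Monte Carlo  :", naive_mc(t, n, rng))
    print("importance sampling:", importance_sampling(t, n, rng))
```

For t = 5 the target probability is on the order of 1e-7, so a naive run of 20,000 draws will usually observe no exceedances at all, whereas about half of the shifted proposal draws land in the event and contribute a weighted term. The same idea, sampling from a distribution under which the rare event is common and then reweighting, underlies importance sampling for more complex statistics, where the difficulty lies in designing a proposal that is provably efficient.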