When / Where
All occurrences of this event have passed.
This listing is displayed for historical purposes.

Presented By: Department of Statistics

Dissertation Defense: Joint Mean and Covariance Modeling of Matrix-Variate Data

Michael Hornstein

We address theory, methodology, and applications for joint mean and covariance estimation with matrix-variate data. Our first project considers joint mean and covariance estimation in the Kronecker product model, which has natural methodological connections to large-scale screening and differential mean analysis in application areas such as genomics. It has been proposed that complex populations, such as those that arise in genomics studies, may exhibit dependencies among observations as well as among variables. This gives rise to the challenging problem of analyzing unreplicated high-dimensional data with unknown mean and dependence structures. Matrix-variate approaches that impose various forms of (inverse) covariance sparsity allow flexible dependence structures to be estimated, but they cannot be applied directly when the mean and covariance matrices are estimated jointly. We present a practical method that uses generalized least squares and penalized (inverse) covariance estimation to address this challenge. We establish consistency and obtain rates of convergence for estimating the mean parameters and covariance matrices. The advantages of our approach are: (i) dependence graphs and covariance structures can be estimated in the presence of unknown mean structure; (ii) the mean structure is estimated more efficiently when the dependence structure among observations is accounted for; and (iii) inferences about the mean parameters become correctly calibrated. We use simulation studies and an analysis of genomic data from a twin study of ulcerative colitis to illustrate the statistical convergence and the performance of our methods in practical settings. Several lines of evidence show that the test statistics for differential gene expression produced by our methods are correctly calibrated and improve power over conventional methods.
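As a rough illustration of the generalized least squares step mentioned above, the following sketch estimates group means when observations are correlated. It is not the method from the dissertation: the covariance among observations `B` is assumed known here, whereas the abstract describes estimating it with a penalized procedure, and the function name `gls_mean`, the two-group design `D`, and the AR(1)-style dependence are all hypothetical choices made for the example.

```python
import numpy as np

def gls_mean(X, D, B):
    """GLS estimate of mean parameters, assuming the row covariance B is known."""
    # Apply B^{-1} to D and X by solving linear systems instead of inverting B.
    Binv_D = np.linalg.solve(B, D)
    Binv_X = np.linalg.solve(B, X)
    # beta_hat = (D' B^{-1} D)^{-1} D' B^{-1} X, one column per variable.
    return np.linalg.solve(D.T @ Binv_D, D.T @ Binv_X)

rng = np.random.default_rng(0)
n, m = 20, 5  # n observations (rows), m variables (columns)

# Hypothetical two-group design: first 10 observations in group 0, rest in group 1.
D = np.zeros((n, 2))
D[:10, 0] = 1.0
D[10:, 1] = 1.0

# Hypothetical AR(1)-style dependence among observations (assumed, not estimated).
idx = np.arange(n)
B = 0.5 ** np.abs(np.subtract.outer(idx, idx))

# Simulate data whose rows are correlated according to B.
beta_true = rng.normal(size=(2, m))
noise = np.linalg.cholesky(B) @ rng.normal(size=(n, m))
X = D @ beta_true + noise

beta_hat = gls_mean(X, D, B)
group_diff = beta_hat[1] - beta_hat[0]  # estimated mean difference per variable
```

With `B` set to the identity matrix, this reduces to ordinary least squares on the group indicators, which is the case where dependence among observations is ignored.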

Our second project uses matrix-variate techniques to gain insight into pitch curve data, which play an important role in linguistics research. These curves can be viewed as large multi-indexed data arrays with distinct covariance behaviors along each index. We estimate covariance and inverse covariance matrices and graphs, and we connect edge structures to word properties. In contrast to the first project, the pitch curve data contain a limited number of replicates, which allows us to use trial residualization to remove mean structure. Using inverse covariance graphs estimated by graphical lasso and nodewise regression, we investigate whether edges are associated with characteristics of the words, including initial consonant, vowel type, and voicing. In particular, we hierarchically decompose the words by consonants and/or by vowels while analyzing edges between individual words as well as between word groups categorized by initial consonant or vowel properties.
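The idea of removing mean structure with replicates can be sketched as follows. The abstract does not define trial residualization in detail, so this example assumes it means subtracting the across-replicate average from each replicate; the array layout (replicates, words, timepoints) and the function names are hypothetical. The resulting covariance among words is the kind of input a graphical lasso or nodewise regression step would take, which is not shown here.

```python
import numpy as np

def residualize_trials(Y):
    """Subtract the across-replicate mean; Y has shape (replicates, words, timepoints)."""
    return Y - Y.mean(axis=0, keepdims=True)

def word_covariance(R):
    """Pooled covariance among words computed from residuals R."""
    r, w, t = R.shape
    # Put words on the rows and replicate-by-timepoint pairs on the columns.
    flat = R.transpose(1, 0, 2).reshape(w, r * t)
    # One mean per (word, timepoint) was removed, leaving (r - 1) * t degrees of freedom.
    return flat @ flat.T / ((r - 1) * t)

rng = np.random.default_rng(1)
Y = rng.normal(size=(4, 6, 30))               # 4 replicates, 6 words, 30 timepoints
mean_structure = rng.normal(size=(1, 6, 30))  # unknown mean shared by all replicates

R = residualize_trials(Y + mean_structure)    # the shared mean cancels out
S = word_covariance(R)                        # 6 x 6 covariance among words
```

Because the mean is shared across replicates, the residuals are identical whether or not `mean_structure` is added, which is why replication makes the mean removable without modeling it.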
