Presented By: Department of Statistics
Statistics Department Seminar Series: Peter Bühlmann, Professor, Department of Statistics, ETH Zürich
“Inhomogeneous large-scale data: new opportunities for causal inference and prediction”
Abstract:
Large-scale or "big" data usually refers to scenarios with potentially very many variables (large dimension) and large sample size. Such data is most often "inhomogeneous" in nature, i.e., it is neither a random sample from a common population nor generated from a stationary distribution. We show how to exploit heterogeneity in large datasets to advantage. A key ingredient is an invariance principle that leads to new approaches for causal inference and novel causal prediction methods which exhibit "robustness" against potentially adversarial scenarios. As a concrete application, we discuss large-scale gene knock-down experiments in yeast (Saccharomyces cerevisiae), where computational and statistical methods have interesting potential for predicting and prioritizing new experimental interventions.