Presented By: Michigan Lifestage Environmental Exposures and Disease Center
Prediction Error & Model Evaluation for Space-Time Downscaling: case studies in air pollution during wildfires
Environmental Statistics Day Lecture by Donatello Telesca (UCLA)
ABSTRACT:
Public health scientists use prediction models to downscale (i.e., interpolate) air pollution exposure where monitoring data are insufficient. This exercise aims to obtain estimates at fine resolutions, so that exposure data may reliably be related to health outcomes. In this setting, substantial research effort has been dedicated to developing statistical models capable of integrating heterogeneous information to obtain accurate predictions: statistical downscaling models, land use regression, and machine learning strategies. However, when presented with the task of choosing between models, or averaging models, we find that our understanding of model performance in the absence of independent statistical replications remains insufficient. This lecture is motivated by several studies of air pollution (PM 2.5 and ground-level ozone) during wildfires. We review the basis for cross validation as a strategy for estimating the expected prediction error. As these performance measures play a crucial role in model selection and averaging, we present a formal characterization of the estimands targeted by different data subsetting strategies, and explore their performance in engineered data settings. A final analysis and a warning about preference inversion are presented in relation to a 2008 wildfire event in Northern California.
BIO:
Dr. Telesca is Associate Professor of Biostatistics at the University of California, Los Angeles. He received a Ph.D. in Statistics from the University of Washington and spent two years at the University of Texas M.D. Anderson Cancer Center as a postdoctoral fellow. His research interests include Bayesian methods in multivariate statistics, functional data analysis, and statistical methods in bio- and nano-informatics. Dr. Telesca is a member of the California NanoSystems Institute and the UCLA Jonsson Comprehensive Cancer Center, and is principal data scientist at Lucid Circuit Inc.