Presented By: Department of Statistics Dissertation Defenses
Contributions to Expected Shortfall Regression
Shushu Zhang
Expected shortfall (ES), defined as the average over the tail below (or above) a certain quantile of a probability distribution, is a coherent measure for characterizing the tail of a distribution, with applications in finance, environmental science, and healthcare research. Expected shortfall regression is a framework for analyzing the relationship between the ES of a response variable and a set of covariates. In health disparity research, for example, it can uncover how the lower or upper tail of the conditional distribution of a health-related outcome relates to the covariates of the subjects. This thesis develops three statistical methodologies for expected shortfall regression.
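To make the definition of ES concrete, the following is a minimal sketch of the unconditional, empirical version: the sample average of the observations at or below (or at or above) a sample quantile. The function name `empirical_es` and its arguments are hypothetical and chosen for illustration only; this is not the regression estimator developed in the thesis.

```python
import numpy as np


def empirical_es(y, tau, lower=True):
    """Return (sample tau-quantile, empirical expected shortfall).

    Illustrative only: the ES is taken as the mean of the observations
    at or below the tau-quantile (lower=True) or at or above it
    (lower=False). The name and signature are assumptions.
    """
    y = np.asarray(y, dtype=float)
    q = np.quantile(y, tau)  # sample quantile, NumPy's default linear interpolation
    tail = y[y <= q] if lower else y[y >= q]
    return q, tail.mean()


# Example: lower-tail ES at level 0.2 and upper-tail ES beyond the 0.8-quantile.
y = np.arange(1, 11)
print(empirical_es(y, 0.2))
print(empirical_es(y, 0.8, lower=False))
```

ES regression replaces the unconditional tail average above with a tail average conditional on covariates.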
In the first chapter, we propose high-dimensional linear expected shortfall regression with a lasso penalty, which yields sparse estimators. We also propose a debiased estimator and establish its asymptotic normality, which permits valid statistical inference. We illustrate the finite-sample performance of the proposed methods through numerical studies and a data application on health disparity.
In the second chapter, we study a novel optimization-based approach to linear expected shortfall regression that relaxes the assumptions made on the conditional quantile models. Although the proposed loss function is implicitly defined, we provide a prototype implementation of the approach that uses initial expected shortfall estimators based on binning techniques or machine learning methods. With practically feasible initial estimators, we establish the consistency and asymptotic normality of the proposed estimator. The proposed approach achieves heterogeneity-adaptive weights and therefore often offers efficiency gains over existing approaches in the literature, as demonstrated through simulation studies.
In the last chapter, we extend the framework to model nonlinear relationships between the covariates and the ES of the response, and introduce a novel expected shortfall random forest (ESRF) framework. The ESRF approach integrates subsampling and data-splitting schemes to construct a nonparametric ensemble that jointly estimates conditional quantiles and expected shortfalls. Building on this framework, we develop the expected shortfall causal forest (ESCF) to estimate the conditional ES treatment effect, defined as the difference between the conditional ES of the potential outcomes. We establish pointwise consistency and asymptotic normality for both the ESRF and the ESCF estimators.
We illustrate the finite-sample performance of the proposed methods through simulations and an empirical application examining health disparities among low-birthweight infants.
https://umich.zoom.us/j/98910982237