Presented By: Industrial & Operations Engineering
Personalized and Distributed Data Analytics in Heterogeneous Environments
Naichen Shi
About the speaker: Naichen Shi is a Ph.D. candidate in the Industrial & Operations Engineering Department at the University of Michigan. His research focuses on personalized, integrative, and science-informed data analytics. Naichen is interested in developing statistical and optimization tools to address real-world challenges across multiple science and engineering domains, including Digital Twins, advanced manufacturing, and transcriptomics. He has published several papers in both methodological and applied journals and conferences, including the Journal of Machine Learning Research (JMLR), Technometrics, NeurIPS, and the Journal of Manufacturing Systems. Naichen has received four best paper recognitions from the Institute for Operations Research and the Management Sciences (INFORMS), including winning the 2024 INFORMS Data Mining Best General Paper Competition.
Abstract: Data is increasingly being collected from distributed and often heterogeneous sources, such as smartphones, connected vehicles, and healthcare devices. While much effort has focused on predictive learning under heterogeneity, I argue in this talk that predictive modeling, without untangling the nature of heterogeneity across users, may lead to significant failures. With this in mind, I present a descriptive framework called Personalized Principal Component Analysis (PCA) that answers the simple question: What is shared, and what is unique? Specifically, we introduce an efficient algorithm, based on distributed manifold gradient descent, to extract identifiable global and local principal components (PCs). The algorithm provably converges linearly, and its output has statistical error that almost matches the lower bound. Building on this, I then highlight our research on predictive modeling under heterogeneity and discuss its implications for collaborative machine learning and interoperable Digital Twins.