Presented By: Michigan Robotics
Probabilistic Deep Learning for Expressive and Reliable World Models
Robotics PhD Defense, Joey Wilson

Co-chairs: Kira Barton & Maani Ghaffari
FRB 2300 and on Zoom
Abstract
In order to plan and execute missions, autonomous robots require informative maps that simultaneously model pertinent information in the environment while also communicating uncertainty. While machine learning methods can enable complex reasoning and tasks, probabilistic methods boast reliable algorithms with quantifiable uncertainty. In this thesis, we study how learning and probabilistic methods can be combined to construct rich world models that fill in gaps from sparse data, model rich semantic information leveraging advancements in neural networks, and model errors or limitations in the map. First, we study mapping in dynamic scenes where combining temporal sequences of data is non-trivial and simple accumulation of data can leave artifacts or traces. We propose novel techniques that extend semantic scene completion networks to consider temporal information, develop a new dataset without artifacts, and demonstrate how scene flow predictions can be leveraged to remove artifacts from moving objects. Next, we consider how object geometry can be leveraged to perform probabilistic mapping in the Bayesian Kernel Inference (BKI) framework. We propose novel algorithms that structure the BKI operation as a convolutional neural network layer with per-category geometry, as well as extend the idea of probabilistic inference using ellipsoids to 3D-Gaussian Splatting (3D-GS). To address the open-ended nature of real-world environments, we create an open-vocabulary probabilistic mapping update method with quantifiable uncertainty, which naturally extends BKI to neural network latent predictions by adopting a Gaussian distribution likelihood compared to the categorical distribution employed for semantic mapping. Finally, we demonstrate how uncertainty estimates can guide active perception through the application of optimal experimental design. Specifically, we show how uncertainty in 3D-GS maps can be used to select informative viewpoints and keyframes, enabling robots to actively and efficiently navigate to reduce map uncertainty during exploration. All proposed algorithms are evaluated on a combination of synthetic and real-world data, with several real-world robot experiments. Overall, this thesis explores a unified framework for robotic world modeling that integrates modern machine learning with classical probabilistic methods, enabling richer, more reliable maps for decision-making in challenging and uncertain environments.
FRB 2300 and on Zoom
Abstract
In order to plan and execute missions, autonomous robots require informative maps that simultaneously model pertinent information in the environment while also communicating uncertainty. While machine learning methods can enable complex reasoning and tasks, probabilistic methods boast reliable algorithms with quantifiable uncertainty. In this thesis, we study how learning and probabilistic methods can be combined to construct rich world models that fill in gaps from sparse data, model rich semantic information leveraging advancements in neural networks, and model errors or limitations in the map. First, we study mapping in dynamic scenes where combining temporal sequences of data is non-trivial and simple accumulation of data can leave artifacts or traces. We propose novel techniques that extend semantic scene completion networks to consider temporal information, develop a new dataset without artifacts, and demonstrate how scene flow predictions can be leveraged to remove artifacts from moving objects. Next, we consider how object geometry can be leveraged to perform probabilistic mapping in the Bayesian Kernel Inference (BKI) framework. We propose novel algorithms that structure the BKI operation as a convolutional neural network layer with per-category geometry, as well as extend the idea of probabilistic inference using ellipsoids to 3D-Gaussian Splatting (3D-GS). To address the open-ended nature of real-world environments, we create an open-vocabulary probabilistic mapping update method with quantifiable uncertainty, which naturally extends BKI to neural network latent predictions by adopting a Gaussian distribution likelihood compared to the categorical distribution employed for semantic mapping. Finally, we demonstrate how uncertainty estimates can guide active perception through the application of optimal experimental design. Specifically, we show how uncertainty in 3D-GS maps can be used to select informative viewpoints and keyframes, enabling robots to actively and efficiently navigate to reduce map uncertainty during exploration. All proposed algorithms are evaluated on a combination of synthetic and real-world data, with several real-world robot experiments. Overall, this thesis explores a unified framework for robotic world modeling that integrates modern machine learning with classical probabilistic methods, enabling richer, more reliable maps for decision-making in challenging and uncertain environments.