Presented By: Weinberg Institute for Cognitive Science
Cognitive Science Seminar: Reinforcement Learning for Sparse-reward Object-interaction Tasks in a First-person Simulated 3D Environment
Wilka Carvalho, U-M
Wilka Carvalho will give a talk titled "Reinforcement Learning for Sparse-reward Object-interaction Tasks in a First-person Simulated 3D Environment"
ABSTRACT
Learning how to execute complex tasks involving multiple objects in a 3D world is challenging under any circumstances, and especially so when there is no ground-truth information about how to use the objects or any opportunity to learn by demonstration. Rewards for completing a task in such a setting are few and far between (sparse rewards), making it difficult for the agent to figure out what to do next. In this work, we show that these challenges can be overcome by including an auxiliary task: learning to predict how objects change upon interaction (the attentive object-model). We show that when this model is used to learn representations of objects, the core learner (a relational RL agent) receives the dense training signal it needs to rapidly find a solution. We demonstrate results in the 3D AI2Thor simulated kitchen environment with a range of challenging food preparation tasks. We compare our method's performance to several related approaches and against the performance of an oracle: an agent that is supplied with ground-truth information about objects in the scene. We find that our model achieves performance closest to the oracle in terms of both learning speed and maximum success rate. With further analysis, we also demonstrate that the attention model is key to the success of our method.
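The auxiliary task described above can be illustrated with a minimal sketch (not the authors' code): attention weights over per-object features select the objects an action likely affected, and the model's prediction error on the post-interaction features supplies a dense training signal alongside the sparse task reward. All names and the toy scene below are illustrative assumptions; the identity "prediction" stands in for a learned transition model.

```python
import math

def softmax(scores):
    # Numerically stable softmax over per-object attention scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def attentive_object_model_loss(objects_before, objects_after, action_scores):
    """Dense auxiliary loss: attention-weighted object-prediction error.

    objects_before / objects_after: per-object feature vectors observed
    before and after an interaction. action_scores: each object's
    relevance to the current action (in the full method these would come
    from a learned attention module).
    """
    weights = softmax(action_scores)
    loss = 0.0
    for w, before, after in zip(weights, objects_before, objects_after):
        # Identity prediction as a stand-in for a learned object model:
        # objects the action actually changed contribute weighted error.
        err = sum((a - b) ** 2 for a, b in zip(after, before))
        loss += w * err
    return loss

# Toy scene: the attended object (high score) changes state; others do not.
before = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
after  = [[0.0, 1.0], [0.5, 0.5], [0.0, 1.0]]
scores = [3.0, 0.0, 0.0]
aux_loss = attentive_object_model_loss(before, after, scores)
```

Because this loss is computed at every interaction rather than only at task completion, it provides the kind of dense signal the abstract argues the relational RL agent needs.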