Abstract
We develop an approach to improve the learning capabilities of robotic systems by combining learned predictive models with experience-based state-action policy mappings. Predictive models provide an understanding of the task and the dynamics, while experience-based (model-free) policy mappings encode favorable actions that override planned actions. We refer to our approach of systematically combining model-based and model-free learning methods as hybrid learning. Our approach efficiently learns motor skills and improves the performance of predictive models and experience-based policies. Moreover, our approach enables policies (both model-based and model-free) to be updated using any off-policy reinforcement learning method. We derive a deterministic method of hybrid learning by optimally switching between learning modalities. We adapt our method to a stochastic variation that relaxes some of the key assumptions in the original derivation. Our deterministic and stochastic variations are tested on a variety of robot control benchmark tasks in simulation as well as a hardware manipulation task. We extend our approach for use with imitation learning methods, where experience is provided through demonstrations, and we test the expanded capability with a real-world pick-and-place task. The results show that our method is capable of improving the performance and sample efficiency of learning motor skills in a variety of experimental domains.
Field | Value |
---|---|
Original language | English (US) |
Pages (from-to) | 337-355 |
Number of pages | 19 |
Journal | International Journal of Robotics Research |
Volume | 42 |
Issue number | 6 |
State | Published - May 2023 |
Keywords
- Reinforcement learning
- hybrid control
- learning theory
- optimal control
ASJC Scopus subject areas
- Software
- Modeling and Simulation
- Mechanical Engineering
- Electrical and Electronic Engineering
- Artificial Intelligence
- Applied Mathematics