Addressing Hindsight Bias in Multigoal Reinforcement Learning

Chenjia Bai, Lingxiao Wang, Yixin Wang, Zhaoran Wang, Rui Zhao, Chenyao Bai, Peng Liu

Research output: Contribution to journal › Article › peer-review

Abstract

Multigoal reinforcement learning (RL) extends standard RL with goal-conditional value functions and policies. One efficient multigoal RL algorithm is hindsight experience replay (HER). By treating a hindsight goal from failed experiences as the original goal, HER enables the agent to receive rewards frequently. However, a key assumption of HER is that hindsight goals do not change the likelihood of the sampled transitions and trajectories used in training, and our analysis shows that this assumption does not hold. More specifically, we show that using hindsight goals changes this likelihood and results in a biased learning objective for multigoal RL. We analyze the hindsight bias that results from using hindsight goals and propose bias-corrected HER (BHER), an efficient algorithm that corrects the hindsight bias in training. We further show that BHER outperforms several state-of-the-art multigoal RL approaches in challenging robotics tasks.
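
The relabeling step that the abstract attributes to HER can be illustrated with a minimal sketch. It is not the authors' implementation and does not include the BHER correction, whose details are not given here. The 1-D goal space, the `sparse_reward` function, its tolerance, and the "future" goal-sampling strategy are all assumptions chosen for illustration.

```python
import random

def sparse_reward(achieved, goal, tol=0.05):
    """Hypothetical sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if abs(achieved - goal) <= tol else -1.0

def relabel_with_hindsight(episode, rng):
    """Relabel each transition with a goal achieved later in the same episode.

    `episode` is a list of dicts with keys: state, action, achieved, goal.
    `achieved` is the goal-space outcome reached after taking the action.
    """
    relabeled = []
    for t, step in enumerate(episode):
        # Assumed "future" strategy: sample an index at or after the current step.
        future = rng.randrange(t, len(episode))
        hindsight_goal = episode[future]["achieved"]
        relabeled.append({
            **step,
            "goal": hindsight_goal,
            "reward": sparse_reward(step["achieved"], hindsight_goal),
        })
    return relabeled

# A failed 1-D episode: the agent never gets near the original goal 1.0.
episode = [
    {"state": 0.0, "action": 0.1, "achieved": 0.1, "goal": 1.0},
    {"state": 0.1, "action": 0.1, "achieved": 0.2, "goal": 1.0},
    {"state": 0.2, "action": 0.1, "achieved": 0.3, "goal": 1.0},
]
original_rewards = [sparse_reward(s["achieved"], s["goal"]) for s in episode]
relabeled = relabel_with_hindsight(episode, random.Random(0))
```

In this sketch every original reward is -1, while the relabeled final transition always succeeds, because its hindsight goal is its own outcome. The hindsight goal is chosen from the same trajectory it is then paired with, so the goal and the transitions are not independent. The abstract identifies this kind of dependence as the source of hindsight bias.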

Original language: English (US)
Journal: IEEE Transactions on Cybernetics
State: Accepted/In press - 2021

Keywords

  • Cybernetics
  • Heuristic algorithms
  • Hindsight bias
  • hindsight experience replay (HER)
  • Manipulators
  • Reinforcement learning
  • reinforcement learning (RL)
  • Task analysis
  • Training
  • Trajectory

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Information Systems
  • Human-Computer Interaction
  • Computer Science Applications
  • Electrical and Electronic Engineering
