Competitive multi-agent inverse reinforcement learning with sub-optimal demonstrations

Xingyu Wang, Diego Klabjan*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contribution

1 Scopus citations

Abstract

This paper considers the problem of inverse reinforcement learning in zero-sum stochastic games when expert demonstrations are known to be sub-optimal. Compared to previous works that decouple agents in the game by assuming optimality in expert policies, we introduce a new objective function that directly pits experts against Nash Equilibrium policies, and we design an algorithm to solve for the reward function in the context of inverse reinforcement learning with deep neural networks as model approximations. To find Nash Equilibrium in large-scale games, we also propose an adversarial training algorithm for zero-sum stochastic games, and show the theoretical appeal of non-existence of local optima in its objective function. In numerical experiments, we demonstrate that our Nash Equilibrium and inverse reinforcement learning algorithms address games that are not amenable to existing benchmark algorithms. Moreover, our algorithm successfully recovers reward and policy functions regardless of the quality of the sub-optimal expert demonstration set.

Original languageEnglish (US)
Title of host publication35th International Conference on Machine Learning, ICML 2018
EditorsAndreas Krause, Jennifer Dy
PublisherInternational Machine Learning Society (IMLS)
Pages8148-8175
Number of pages28
ISBN (Electronic)9781510867963
StatePublished - Jan 1 2018
Event35th International Conference on Machine Learning, ICML 2018 - Stockholm, Sweden
Duration: Jul 10 2018Jul 15 2018

Publication series

Name35th International Conference on Machine Learning, ICML 2018
Volume11

Other

Other35th International Conference on Machine Learning, ICML 2018
CountrySweden
CityStockholm
Period7/10/187/15/18

    Fingerprint

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Human-Computer Interaction
  • Software

Cite this

Wang, X., & Klabjan, D. (2018). Competitive multi-agent inverse reinforcement learning with sub-optimal demonstrations. In A. Krause, & J. Dy (Eds.), 35th International Conference on Machine Learning, ICML 2018 (pp. 8148-8175). (35th International Conference on Machine Learning, ICML 2018; Vol. 11). International Machine Learning Society (IMLS).