Competitive multi-agent inverse reinforcement learning with sub-optimal demonstrations

Xingyu Wang, Diego Klabjan*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper considers the problem of inverse reinforcement learning in zero-sum stochastic games when expert demonstrations are known to be sub-optimal. In contrast to previous work, which decouples the agents in the game by assuming optimality of the expert policies, we introduce a new objective function that directly pits experts against Nash Equilibrium policies, and we design an algorithm that solves for the reward function in the inverse reinforcement learning setting with deep neural networks as model approximations. To find a Nash Equilibrium in large-scale games, we also propose an adversarial training algorithm for zero-sum stochastic games and show that its objective function has no local optima, a theoretically appealing property. In numerical experiments, we demonstrate that our Nash Equilibrium and inverse reinforcement learning algorithms handle games that are not amenable to existing benchmark algorithms. Moreover, our algorithm successfully recovers reward and policy functions regardless of the quality of the sub-optimal expert demonstration set.
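The equilibrium-finding component above is defined for large-scale stochastic games with deep network approximators; the actual algorithm is in the paper. As a toy illustration of the underlying idea only (computing a zero-sum Nash Equilibrium by iterative self-play), the sketch below solves a 2x2 matrix game with multiplicative-weights (Hedge) updates, a standard no-regret method rather than the authors' adversarial training algorithm; the helper name hedge_selfplay and all parameter values are ours.

```python
# Toy sketch: Nash Equilibrium of a zero-sum matrix game via self-play.
# NOT the paper's algorithm; a standard multiplicative-weights (Hedge) method.
import numpy as np

# Matching pennies: A is the row player's payoff; the column player gets -A.
# Its unique Nash Equilibrium is uniform play by both players, value 0.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def hedge_selfplay(A, iters=5000, eta=0.1):
    """Both players run Hedge against each other.

    In zero-sum games the *time-averaged* strategies of two no-regret
    learners converge to a Nash Equilibrium; the instantaneous strategies
    may cycle, which is why the averages are returned.
    """
    n, m = A.shape
    p = np.ones(n) / n            # row player's mixed strategy
    q = np.ones(m) / m            # column player's mixed strategy
    p_sum, q_sum = np.zeros(n), np.zeros(m)
    for _ in range(iters):
        p = p * np.exp(eta * (A @ q))      # row player maximizes p^T A q
        p /= p.sum()
        q = q * np.exp(-eta * (A.T @ p))   # column player minimizes it
        q /= q.sum()
        p_sum += p
        q_sum += q
    return p_sum / iters, q_sum / iters

p_star, q_star = hedge_selfplay(A)
print("row strategy    ~", p_star)              # ~[0.5, 0.5]
print("column strategy ~", q_star)              # ~[0.5, 0.5]
print("game value      ~", p_star @ A @ q_star) # ~0.0
```

An inverse reinforcement learning loop of the kind the abstract describes would alternate a step like this (re-solving for the equilibrium under the current reward estimate) with an update of the reward that scores the equilibrium policies against the sub-optimal expert demonstrations.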

Original language: English (US)
Title of host publication: 35th International Conference on Machine Learning, ICML 2018
Editors: Andreas Krause, Jennifer Dy
Publisher: International Machine Learning Society (IMLS)
Pages: 8148-8175
Number of pages: 28
Volume: 11
ISBN (Electronic): 9781510867963
State: Published - Jan 1 2018
Event: 35th International Conference on Machine Learning, ICML 2018 - Stockholm, Sweden
Duration: Jul 10 2018 - Jul 15 2018

Other

Other: 35th International Conference on Machine Learning, ICML 2018
Country: Sweden
City: Stockholm
Period: 7/10/18 - 7/15/18

Fingerprint

  • Reinforcement learning
  • Demonstrations
  • Learning algorithms
  • Experiments

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Human-Computer Interaction
  • Software

Cite this

Wang, X., & Klabjan, D. (2018). Competitive multi-agent inverse reinforcement learning with sub-optimal demonstrations. In A. Krause, & J. Dy (Eds.), 35th International Conference on Machine Learning, ICML 2018 (Vol. 11, pp. 8148-8175). International Machine Learning Society (IMLS).
@inproceedings{a44f35f8a0044a268ac67578f18f8613,
title = "Competitive multi-agent inverse reinforcement learning with sub-optimal demonstrations",
abstract = "This paper considers the problem of inverse reinforcement learning in zero-sum stochastic games when expert demonstrations are known to be sub-optimal. Compared to previous works that decouple agents in the game by assuming optimality in expert policies, we introduce a new objective function that directly pits experts against Nash Equilibrium policies, and we design an algorithm to solve for the reward function in the context of inverse reinforcement learning with deep neural networks as model approximations. To find Nash Equilibrium in large-scale games, we also propose an adversarial training algorithm for zero-sum stochastic games, and show the theoretical appeal of non-existence of local optima in its objective function. In numerical experiments, we demonstrate that our Nash Equilibrium and inverse reinforcement learning algorithms address games that are not amenable to existing benchmark algorithms. Moreover, our algorithm successfully recovers reward and policy functions regardless of the quality of the sub-optimal expert demonstration set.",
author = "Xingyu Wang and Diego Klabjan",
year = "2018",
month = "1",
day = "1",
language = "English (US)",
volume = "11",
pages = "8148--8175",
editor = "Andreas Krause and Jennifer Dy",
booktitle = "35th International Conference on Machine Learning, ICML 2018",
publisher = "International Machine Learning Society (IMLS)",

}

Wang, X & Klabjan, D 2018, Competitive multi-agent inverse reinforcement learning with sub-optimal demonstrations. in A Krause & J Dy (eds), 35th International Conference on Machine Learning, ICML 2018. vol. 11, International Machine Learning Society (IMLS), pp. 8148-8175, 35th International Conference on Machine Learning, ICML 2018, Stockholm, Sweden, 7/10/18.

Competitive multi-agent inverse reinforcement learning with sub-optimal demonstrations. / Wang, Xingyu; Klabjan, Diego.

35th International Conference on Machine Learning, ICML 2018. ed. / Andreas Krause; Jennifer Dy. Vol. 11 International Machine Learning Society (IMLS), 2018. p. 8148-8175.

Research output: Chapter in Book/Report/Conference proceedingConference contribution

TY - GEN

T1 - Competitive multi-agent inverse reinforcement learning with sub-optimal demonstrations

AU - Wang, Xingyu

AU - Klabjan, Diego

PY - 2018/1/1

Y1 - 2018/1/1

N2 - This paper considers the problem of inverse reinforcement learning in zero-sum stochastic games when expert demonstrations are known to be sub-optimal. Compared to previous works that decouple agents in the game by assuming optimality in expert policies, we introduce a new objective function that directly pits experts against Nash Equilibrium policies, and we design an algorithm to solve for the reward function in the context of inverse reinforcement learning with deep neural networks as model approximations. To find Nash Equilibrium in large-scale games, we also propose an adversarial training algorithm for zero-sum stochastic games, and show the theoretical appeal of non-existence of local optima in its objective function. In numerical experiments, we demonstrate that our Nash Equilibrium and inverse reinforcement learning algorithms address games that are not amenable to existing benchmark algorithms. Moreover, our algorithm successfully recovers reward and policy functions regardless of the quality of the sub-optimal expert demonstration set.

AB - This paper considers the problem of inverse reinforcement learning in zero-sum stochastic games when expert demonstrations are known to be sub-optimal. Compared to previous works that decouple agents in the game by assuming optimality in expert policies, we introduce a new objective function that directly pits experts against Nash Equilibrium policies, and we design an algorithm to solve for the reward function in the context of inverse reinforcement learning with deep neural networks as model approximations. To find Nash Equilibrium in large-scale games, we also propose an adversarial training algorithm for zero-sum stochastic games, and show the theoretical appeal of non-existence of local optima in its objective function. In numerical experiments, we demonstrate that our Nash Equilibrium and inverse reinforcement learning algorithms address games that are not amenable to existing benchmark algorithms. Moreover, our algorithm successfully recovers reward and policy functions regardless of the quality of the sub-optimal expert demonstration set.

UR - http://www.scopus.com/inward/record.url?scp=85057312102&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85057312102&partnerID=8YFLogxK

M3 - Conference contribution

VL - 11

SP - 8148

EP - 8175

BT - 35th International Conference on Machine Learning, ICML 2018

A2 - Krause, Andreas

A2 - Dy, Jennifer

PB - International Machine Learning Society (IMLS)

ER -

Wang X, Klabjan D. Competitive multi-agent inverse reinforcement learning with sub-optimal demonstrations. In Krause A, Dy J, editors, 35th International Conference on Machine Learning, ICML 2018. Vol. 11. International Machine Learning Society (IMLS). 2018. p. 8148-8175