TY - JOUR

T1 - Actor-critic provably finds Nash equilibria of linear-quadratic mean-field games

AU - Fu, Zuyue

AU - Yang, Zhuoran

AU - Chen, Yongxin

AU - Wang, Zhaoran

N1 - Publisher Copyright:
Copyright © 2019, The Authors. All rights reserved.

PY - 2019/10/16

Y1 - 2019/10/16

AB - We study discrete-time mean-field Markov games with infinite numbers of agents where each agent aims to minimize its ergodic cost. We consider the setting where the agents have identical linear state transitions and quadratic cost functions, while the aggregated effect of the agents is captured by the population mean of their states, namely, the mean-field state. For such a game, based on the Nash certainty equivalence principle, we provide sufficient conditions for the existence and uniqueness of its Nash equilibrium. Moreover, to find the Nash equilibrium, we propose a mean-field actor-critic algorithm with linear function approximation, which does not require knowing the model of dynamics. Specifically, at each iteration of our algorithm, we use the single-agent actor-critic algorithm to approximately obtain the optimal policy of each agent given the current mean-field state, and then update the mean-field state. In particular, we prove that our algorithm converges to the Nash equilibrium at a linear rate. To the best of our knowledge, this is the first success of applying model-free reinforcement learning with function approximation to discrete-time mean-field Markov games with provable non-asymptotic global convergence guarantees.

UR - http://www.scopus.com/inward/record.url?scp=85094715993&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85094715993&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:85094715993

JO - Free Radical Biology and Medicine

JF - Free Radical Biology and Medicine

SN - 0891-5849

ER -