Abstract
This paper addresses a link scheduling problem in networks represented by conflict graphs using a distributed learning approach. Each agent in the network controls a single link and has access only to its own state and the states of the links in its neighborhood. The goal is to minimize the average packet delay in the multi-agent network. The problem is formulated as a decentralized partially observable Markov decision process (Dec-POMDP). The proposed solution adopts a centralized training and distributed execution paradigm and leverages an on-policy reinforcement learning algorithm. Specifically, the paper employs the multi-agent proximal policy optimization (MAPPO) algorithm with judiciously designed recurrent structures in the neural network. Simulations show that the proposed solution outperforms some widely used schedulers in terms of both throughput and delay.
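The setting described in the abstract can be made concrete with a small sketch. This is only an illustration, not the paper's implementation: the 4-link conflict graph, the queue-length state, the one-packet-per-slot service model, and the names `CONFLICTS`, `is_feasible`, `local_observation`, and `step` are all assumptions introduced here. It shows what a conflict-free schedule is, what a single agent can observe, and how queues might evolve over one time slot.

```python
from typing import Dict, List, Set

# Hypothetical 4-link conflict graph: an edge means the two links
# cannot transmit successfully in the same time slot.
CONFLICTS: Dict[int, Set[int]] = {
    0: {1},
    1: {0, 2},
    2: {1, 3},
    3: {2},
}


def is_feasible(active: Set[int], conflicts: Dict[int, Set[int]]) -> bool:
    """A schedule is feasible when the active links form an independent set."""
    return all(conflicts[link].isdisjoint(active) for link in active)


def local_observation(link: int, queues: List[int],
                      conflicts: Dict[int, Set[int]]) -> Dict[int, int]:
    """Queue lengths visible to one agent: its own link plus its neighbors."""
    visible = {link} | conflicts[link]
    return {k: queues[k] for k in sorted(visible)}


def step(queues: List[int], active: Set[int], arrivals: List[int]) -> List[int]:
    """Assumed dynamics: serve one packet per active link, then add arrivals."""
    served = [max(q - 1, 0) if i in active else q for i, q in enumerate(queues)]
    return [q + a for q, a in zip(served, arrivals)]


queues = [3, 0, 2, 1]
schedule = {0, 2}  # links 0 and 2 do not conflict with each other
print(is_feasible(schedule, CONFLICTS))
print(local_observation(1, queues, CONFLICTS))
print(step(queues, schedule, arrivals=[0, 1, 0, 0]))
```

In this sketch the agent for link 1 sees only the queues of links 0, 1, and 2, which mirrors the partial observability that motivates the Dec-POMDP formulation. A learned policy would map such local observations to a transmit decision in each slot.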
Original language | English (US) |
---|---|
Title of host publication | 2023 59th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2023 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Electronic) | 9798350328141 |
State | Published - 2023 |
Event | 59th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2023, Monticello, United States. Duration: Sep 26 2023 → Sep 29 2023 |
Publication series
Name | 2023 59th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2023 |
---|---|
Conference
Conference | 59th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2023 |
---|---|
Country/Territory | United States |
City | Monticello |
Period | 9/26/23 → 9/29/23 |
Funding
This work was supported in part by the National Science Foundation (NSF) under grant No. 2003098, by a gift from Intel Corporation, by the SpectrumX Center (NSF grant No. 2132700), and by the Institute for Data, Econometrics, Algorithms and Learning (NSF grant No. 2216970). It was also supported in part by NSF under Contract Nos. CNS-1719384 and IIS-1636772.
Keywords
- Decentralized partially observable Markov decision process (Dec-POMDP)
- dynamic traffic
- multi-agent reinforcement learning (MARL)
- recurrent neural networks
- wireless networks
ASJC Scopus subject areas
- Artificial Intelligence
- Computational Theory and Mathematics
- Computer Networks and Communications
- Computer Science Applications
- Computational Mathematics
- Control and Optimization