Data-Driven Robust Multi-Agent Reinforcement Learning

IEEE International Workshop on Machine Learning for Signal Processing, MLSP, ISSN: 2161-0371, Vol: 2022-August, Page: 1-6
2022
  • 1
    Citations
  • 0
    Usage
  • 0
    Captures
  • 0
    Mentions
  • 0
    Social Media

Conference Paper Description

Multi-agent reinforcement learning (MARL) in the collaborative setting aims to find a joint policy that maximizes the accumulated reward averaged over all agents. In this paper, we focus on MARL under model uncertainty, where the transition kernel is assumed to lie in an uncertainty set, and the goal is to optimize the worst-case performance over that set. We investigate the model-free setting, where the uncertainty set is centered on an unknown Markov decision process from which a single sample trajectory can be obtained sequentially. We develop a robust multi-agent Q-learning algorithm that is model-free and fully decentralized. We theoretically prove that the proposed algorithm converges to the minimax robust policy, and further characterize its sample complexity. Compared to vanilla multi-agent Q-learning, our algorithm offers provable robustness under model uncertainty without incurring additional computational or memory cost.
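
To make the worst-case objective concrete, here is a minimal single-agent sketch of a robust tabular Q-learning step. It is not the paper's algorithm: the abstract does not specify the form of the uncertainty set, so the sketch assumes an R-contamination set (with probability `R` the next state may be chosen adversarially), and it leaves out the decentralized multi-agent part entirely. The function name `robust_q_update` and all parameter names are hypothetical.

```python
import numpy as np

def robust_q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9, R=0.2):
    """One robust Q-learning step (illustrative; assumes R-contamination).

    Q is a (num_states, num_actions) table that is updated in place.
    """
    # Value of the next state actually observed on the sample trajectory.
    nominal = Q[s_next].max()
    # Value of the least favorable state: the adversary's choice under
    # the assumed R-contamination uncertainty set.
    worst = Q.max(axis=1).min()
    # Worst-case target: mix the sampled value with the adversarial value.
    target = r + gamma * ((1.0 - R) * nominal + R * worst)
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

# Usage: 3 states, 2 actions, one observed transition (s=0, a=0, r=1, s'=2).
Q = np.zeros((3, 2))
Q[1] = [1.0, 2.0]
Q[2] = [4.0, 3.0]
Q = robust_q_update(Q, s=0, a=0, r=1.0, s_next=2)
```

Setting `R=0` recovers the ordinary Q-learning target. Under this assumed set, the only extra work per step is a minimum over the state values, and no extra table is stored.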

Bibliographic Details

Yudan Wang; Yue Wang; Shaofeng Zou; Yi Zhou; Alvaro Velasquez

Institute of Electrical and Electronics Engineers (IEEE)

Computer Science
