Publication: Encouraging Cooperation in Multi Agent Reinforcement Learning
Abstract
Encouraging cooperation in multi-agent reinforcement learning (MARL) remains an active area of research. Scaling up from the single-agent setting introduces additional complexity and non-stationarity, which makes convergence to optimal policies harder than in single-agent reinforcement learning. In this thesis, we build on previous work demonstrating the empirical effectiveness of policy-gradient methods in multi-agent settings, specifically Proximal Policy Optimization (PPO). We introduce a novel test-bed for multi-agent reinforcement learning and evaluate the effectiveness of a decentralized PPO framework in it. Furthermore, motivated by literature showing the benefits of reward shaping for convergence in the single-agent setting, we apply domain-specific reward shaping to our PPO method to encourage cooperation and faster convergence to a good joint policy.