Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination
by Shauharda Khadka, Somdeb Majumdar, Santiago Miret, Stephen McAleer, and Kagan Tumer
2019
Abstract
A key challenge for multiagent reinforcement learning (RL) is the design of agent-specific, local rewards that are aligned with sparse global objectives. In this paper, we introduce Multiagent Evolutionary RL (MERL), a hybrid algorithm that does not require an explicit alignment between local and global objectives. MERL uses fast, policy-gradient-based learning for each agent by utilizing its dense local rewards. Concurrently, an evolutionary algorithm is used to recruit agents into a team by directly optimizing the sparser global objective. We explore problems that require coupling (a minimum number of agents coordinating for success), where the degree of coupling is not known to the agents. We demonstrate that MERL's integrated approach is more sample-efficient and retains performance better with increasing coupling orders compared to MADDPG, the state-of-the-art policy-gradient algorithm for multiagent coordination.
Archived Files and Locations
application/pdf 1.8 MB
arxiv.org (repository) | web.archive.org (webarchive)
arXiv: 1906.07315v1