David B. D’Ambrosio, Joel Lehman, Sebastian Risi, and Kenneth O. Stanley (2010)
Evolving Policy Geometry for Scalable Multiagent Learning
In: Proceedings of the Ninth International Conference on Autonomous Agents and Multiagent Systems (AAMAS-2010). (8 pages)

Note: This paper is accompanied by a set of videos at http://eplex.cs.ucf.edu/mahnaamas2010.html


A major challenge for traditional approaches to multiagent learning is to train teams that easily scale to include additional agents. The problem is that such approaches typically encode each agent's policy separately. Such separation means that computational complexity explodes as the number of agents in the team increases, and it also leads to the problem of reinvention: skills that should be shared among agents must be rediscovered separately for each agent. To address this problem, this paper presents an alternative evolutionary approach to multiagent learning called multiagent HyperNEAT that encodes the team as a pattern of related policies rather than as a set of individual agents. To capture this pattern, a policy geometry is introduced to describe the relationship between each agent's policy and its canonical geometric position within the team. Because the policy geometry can encode variations of a shared skill across all of the policies it represents, the problem of reinvention is avoided. Furthermore, because the policy geometry of a particular team can be sampled at any resolution, it acts as a heuristic for generating policies for teams of any size, producing a powerful new capability for multiagent learning. In this paper, multiagent HyperNEAT is tested in predator-prey and room-clearing domains. In both domains the result is effective teams that scale successfully to larger team sizes without any further training.
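The core idea of sampling a policy geometry at any resolution can be illustrated with a minimal sketch. In the paper the generative function is an evolved CPPN; here a fixed smooth function stands in for it, and all names are illustrative rather than taken from the paper:

```python
import math

def policy_for_position(x):
    """Hypothetical stand-in for an evolved CPPN: map a normalized team
    position x in [0, 1] to a tiny policy (weights of a two-input linear
    controller). Nearby positions yield similar policies, so a shared
    skill varies smoothly across the team instead of being reinvented
    per agent."""
    return [math.sin(math.pi * x), math.cos(math.pi * x)]

def sample_team(n_agents):
    """Sample the policy geometry at resolution n_agents: because every
    policy is generated from a position, a team of any size can be
    produced from the same geometry without further training."""
    positions = [i / (n_agents - 1) for i in range(n_agents)]
    return [policy_for_position(x) for x in positions]

small_team = sample_team(3)   # three agents spanning the team's extent
large_team = sample_team(10)  # same geometry resampled for ten agents
```

Note that the endpoint agents of the small and large teams receive identical policies (they occupy the same canonical positions), while the interior agents interpolate the same underlying pattern at a finer resolution.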