David B. D'Ambrosio, Joel Lehman, Sebastian Risi, and Kenneth O. Stanley (2011)
Task Switching in Multiagent Learning through Indirect Encoding
In: Proceedings of the International Conference on Intelligent Robots and Systems (IROS 2011, San Francisco, CA). Piscataway, NJ: IEEE (8 pages).

Note: This article is accompanied by a demonstration video at http://eplex.cs.ucf.edu/patrolling.html

Abstract 

Multirobot domains are a challenge for learning algorithms because they require robots to learn to cooperate to achieve a common goal. The challenge only becomes greater when robots must perform heterogeneous tasks to reach that goal. Multiagent HyperNEAT is a neuroevolutionary method (i.e., a method that evolves neural networks) that has proven successful in several cooperative multiagent domains by exploiting the concept of policy geometry, whereby the policies of team members are learned as a function of how the members relate to each other based on their canonical starting positions. This paper extends the multiagent HyperNEAT algorithm by introducing situational policy geometry, which allows each agent to encode multiple policies that are switched depending on the agent’s state. This concept is demonstrated both in simulation and on real Khepera III robots in a patrol-and-return task, in which robots must cooperate to cover an area and return home when called. Robot teams trained with situational policy geometry are compared to teams trained without it; the former are shown to find solutions more consistently, and those solutions also transfer to the real world.
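
To make the core idea more concrete, the minimal Python sketch below illustrates how a single CPPN-style network could generate a distinct policy for each situation by taking the situation as an extra input, so that an agent switches policies when its state changes. This is a hypothetical illustration, not the authors' implementation: the placeholder CPPN function, substrate coordinates, pruning threshold, and numeric situation encoding are all assumptions made for the sketch.

    import math

    # Stand-in for an evolved CPPN (compositional pattern-producing network).
    # In HyperNEAT the CPPN is evolved by NEAT; here it is a fixed smooth
    # function so the sketch is self-contained and runnable.
    def cppn(x1, y1, x2, y2, situation):
        return math.sin(3.0 * x1 * x2 + situation) * math.exp(-((y1 - y2) ** 2))

    def build_policy(substrate, situation, threshold=0.2):
        """Query the CPPN once per (source, target) neuron pair to generate
        the connection weights of one policy for the given situation."""
        weights = {}
        for src in substrate:
            for dst in substrate:
                w = cppn(*src, *dst, situation)
                if abs(w) > threshold:  # prune weak connections, as in HyperNEAT
                    weights[(src, dst)] = w
        return weights

    # A tiny illustrative substrate of neuron coordinates.
    substrate = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]

    # The same CPPN yields a separate policy per situation; the agent switches
    # between them as its state changes (e.g., patrolling vs. returning home).
    patrol_policy = build_policy(substrate, situation=0.0)
    return_policy = build_policy(substrate, situation=1.0)

Because both policies come from one generating network, under this scheme they can share structure while still specializing per situation, which is the intuition behind encoding multiple switchable policies indirectly.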