RI: Small: Reinforcement Learning in Partially Observable Multi-Agent Tasks

$247,664 | FY2015 | CSE | NSF

University of Southern Mississippi, Hattiesburg, MS

Abstract

The goal of this project is to develop new techniques for designing robot controllers (programs that drive individual autonomous robots) for multi-robot tasks, i.e., tasks involving a collaborative team of robots. Existing techniques with sound decision-theoretic and game-theoretic properties address only simple multi-agent systems. In practice, robot controllers are still largely designed manually, and little is known about the decision-theoretic optimality of such controllers, especially in multi-robot settings. This project aims to change that by bridging the gap between multi-agent decision theory and multi-robot control via three main thrusts. Two of these thrusts seek to modify the ways in which multi-agent decision theory describes and computes these controllers, so that it becomes applicable to multi-robot tasks. The third thrust seeks to make the new computational approach scale to large robot teams. The project has mixed (theoretical and applied) scope and will lay the foundation for more principled design of future multi-robot systems.

The project will also have broad educational impact. The research will be conducted in collaboration with a graduate student and an undergraduate student, will involve hands-on robotics experience for students, and will generate material for a new undergraduate course on robotics. Any theory and algorithms developed will be shared publicly.

More specifically, the first main thrust of this project will investigate the expression and game-theoretic optimization of behavior-based controllers, a popular controller language in the robotics community. Nonlinear programming techniques will be used to optimize modular behaviors, which will be integrated hierarchically, from simpler to more complex tasks. The second main thrust will use multi-agent reinforcement learning so that the agents/robots compute their own controllers via joint exploration in task simulations, exploiting inductive knowledge transfer to bootstrap the learning of complex behaviors from simpler, related ones. The third main thrust will develop a new paradigm of multi-agent reinforcement learning: Reinforcement Learning as a Rehearsal (RLaR). Agents will learn in a supervised setting with access to hidden as well as observable information, which reduces sample complexity, and will then marginalize out the hidden features to yield controllers usable in partially observable settings.
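The marginalization step in the RLaR description can be illustrated with a small tabular sketch. This is only one possible reading of the abstract, not the project's actual algorithm: it assumes action values Q(s, o, a) were learned during rehearsal with the hidden state s visible, and that an empirical P(s | o) is available from rehearsal visit counts. The function name `marginalize_q`, the dictionary layout, and the toy two-door task are all hypothetical.

```python
from collections import defaultdict


def marginalize_q(q_full, counts):
    """Collapse Q(s, o, a) to Q(o, a), weighting by empirical P(s | o).

    q_full: {(hidden_state, observation, action): value} from rehearsal.
    counts: {(hidden_state, observation): visit count} from rehearsal.
    Both structures are illustrative assumptions, not taken from the project.
    """
    # Total visits per observation, used to normalize counts into P(s | o).
    obs_totals = defaultdict(float)
    for (_, o), n in counts.items():
        obs_totals[o] += n

    # Q(o, a) = sum over s of P(s | o) * Q(s, o, a).
    q_obs = defaultdict(float)
    for (s, o, a), q in q_full.items():
        total = obs_totals.get(o, 0.0)
        if total > 0:
            q_obs[(o, a)] += (counts.get((s, o), 0) / total) * q
    return dict(q_obs)


# Toy rehearsal data: the hidden state says which door hides the reward,
# and the agent only ever observes an ambiguous "noise" signal.
counts = {("left", "noise"): 3, ("right", "noise"): 1}
q_full = {
    ("left", "noise", "open_left"): 10.0,
    ("right", "noise", "open_left"): -10.0,
    ("left", "noise", "open_right"): -10.0,
    ("right", "noise", "open_right"): 10.0,
}

q_obs = marginalize_q(q_full, counts)
best_action = max(("open_left", "open_right"), key=lambda a: q_obs[("noise", a)])
print(q_obs, best_action)
```

The resulting table depends only on the observation and the action, so a controller built from it never needs the hidden state at execution time, which is the property the abstract attributes to the learned controllers.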
