Fast Reinforcement Learning Using Multiple Models and State Decomposition

$154,244FY2014ENGNSF

Indiana University, Bloomington IN

Investigators

Abstract

This project attempts to develop better methods for Reinforcement Learning and Approximate Dynamic Programming (RLADP), in order to be able to handle decision tasks with greater complexity both in time and in space. Reinforcement learning systems are systems which can learn to maximize any measure of performance or satisfaction, based on their experience of observing their environment, acting on the environment, and receiving feedback on performance, similar to the pain or pleasure which is used to reinforce animal behavior. Current reinforcement learning methods do not learn fast enough to perform well, when their environment is too complex in space or in time. This project will develop new methods to handle that kind of complexity. The team will also have a collaboration with IBM research, and will try to address a testbed problem involving the management of a fleet of plug-in hybrid cars. Complexity in time will be handled by use of a multiple model approach, connecting various options or skills by evaluation and updating of the landmark states which mark transitions between different regions of state space. This is similar to previous work on decision blocks and modified Bellman equations previously presented at the PI's workshop on learning and adaptive systems, but otherwise is a unique, new an important direction. Complexity in space is addressed by a multiagent approach, based on a kind of spatial decomposition.

View original record on NSF Award Search →