CRII: RI: Accelerated Stochastic Approximation for Reinforcement Learning

$182,616 | FY2016 | CSE | NSF

Indiana University, Bloomington IN

Investigators

Abstract

This project develops a new class of accelerated learning techniques for reinforcement learning. Reinforcement learning is an approach to autonomous decision-making through trial-and-error interaction with an unknown environment, with a focus on learning incrementally from the resulting stream of data. It has significant industrial potential, particularly for real-time control systems such as active network management for energy and search-and-rescue robots, and is already used in a wide range of fields, including robotics, psychology, animal learning, and neuroscience. To improve the practical application of reinforcement learning, this project proposes a new class of algorithms that aim to balance computational complexity against sample efficiency, since sample-efficient learning often requires significant computation and memory. The space of algorithms that attempt to balance both requirements has been under-explored for reinforcement learning, and provides exciting opportunities to impact industrial applications and the growing area of computational sustainability. An important aspect of this project will be to implement and study these algorithms on a wide range of simulated environments, and to engage a diverse group of students through courses and summer research.

This project develops efficient incremental approximations that summarize gathered samples for improved sample efficiency, along with an empirical framework to evaluate these algorithms. This new class of accelerated learning techniques formally trades off computation and accuracy and has many promising extensions and research directions, through a variety of accelerated stochastic gradient descent techniques and incremental matrix approximations. A further focus is to develop tools and novel measures for the reinforcement learning community that evaluate this balance between sample efficiency and computational complexity, with the code framework released through an existing open-source platform. This initial systematic exploration of these novel optimization variants will lay the foundation for the long-term goal of improving the efficacy of reinforcement learning in industry and for practical autonomous agents.
