CAREER: Value Function Approximation for Control of Complex Systems
Stanford University, Stanford CA
Investigators
Abstract
9985229 Van Roy This proposed research is devoted to the development of streamlined and reliable computational methods for value function approximation. A successful outcome would be approximation algorithms that are widely accessible and effective in the control of complex systems. The proposed approximation methods build on work in the area of neuro-dynamic programming, which is sometimes called "Approximate Dynamic Programming" or "Reinforcement Learning." The algorithms to be developed are based on approximate value iteration, temporal-difference learning, and linear programming. A method for "feature selection" involving the use of value functions associated with simplified problems will also be explored. To promote a pragmatic view of the methods under development, and to provide a testbed for evaluating ideas, two applications have been chosen to play integral roles in the project: dynamic risk management and the control of multiclass queuing networks. The educational component of this project includes a new graduate-level course on neuro-dynamic programming, together with a realignment of current courses to incorporate a greater emphasis on computation, to foster an appreciation for the use of approximations when systems become more complex, and to promote a unified view of stochastic control problems across many disciplines.