Understanding and Improving On-Line Planning Methods
Georgia Tech Research Corporation, Atlanta GA
Investigators
Abstract
This is the first-year funding of a three-year continuing award. A variety of on-line planning methods are used in artificial intelligence, including real-time search methods such as LRTA*, reinforcement-learning methods such as Q-learning, and robot-navigation methods such as D*. The PIs intend to improve the performance of these and other on-line planning methods substantially so that, for example, future robot-navigation methods will be able to map unknown terrain significantly faster than is now possible while retaining the advantageous properties of existing on-line planning methods. Many on-line planning methods, either always or most of the time, execute actions that move the agent in the perceived direction of the goal, that is, actions that reduce the estimates of the goal distances the most. However, the PIs' preliminary theoretical results show that executing actions that move the agent in the perceived direction of the goal is usually not a good idea. For example, D* does not reach a goal location in unknown terrain with a minimal travel distance in the worst case. The key to improving the performance of these on-line planning methods, then, is to exploit the distance estimates that they maintain (or can maintain) in a way that is more directly related to the planning or learning objective. The PIs will study the properties of on-line planning methods both theoretically and experimentally, and will develop improved on-line planning methods that have the same interface as the existing methods, so that users can easily substitute the new methods for the ones they are currently using. Side benefits of the proposed research include a test-bed for the experimental evaluation of robot-navigation methods in unknown terrain and a solid theoretical foundation for understanding robot-navigation methods in unknown terrain, including D*.
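To make the "move in the perceived direction of the goal" behavior concrete, the following is a minimal sketch of an LRTA*-style agent, not the PIs' method. It assumes a four-connected grid with unit action costs and a Manhattan-distance initial estimate; the names `neighbors` and `lrta_star`, the grid dimensions, and the obstacle layout are all hypothetical and chosen only for illustration. At each step the agent looks only at the cells adjacent to its current cell, raises its own goal-distance estimate if the neighbors show it was too low, and moves to the neighbor with the smallest estimate.

```python
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]


def neighbors(cell: Cell, blocked: Set[Cell], width: int, height: int) -> List[Cell]:
    """Return the unblocked four-connected neighbors of a cell (illustrative helper)."""
    x, y = cell
    result = []
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nxt = (x + dx, y + dy)
        if 0 <= nxt[0] < width and 0 <= nxt[1] < height and nxt not in blocked:
            result.append(nxt)
    return result


def lrta_star(start: Cell, goal: Cell, blocked: Set[Cell],
              width: int, height: int, max_steps: int = 10_000) -> List[Cell]:
    """Run one LRTA*-style trial and return the sequence of visited cells."""
    h: Dict[Cell, int] = {}  # learned goal-distance estimates

    def estimate(c: Cell) -> int:
        # Assumption: Manhattan distance is the initial estimate for unseen cells.
        return h.get(c, abs(c[0] - goal[0]) + abs(c[1] - goal[1]))

    path = [start]
    current = start
    for _ in range(max_steps):
        if current == goal:
            return path
        successors = neighbors(current, blocked, width, height)
        if not successors:
            raise ValueError("agent has no legal move")
        # Greedy choice: the neighbor that minimizes step cost plus estimate.
        best = min(successors, key=lambda s: 1 + estimate(s))
        # Value update: never lower an estimate, only raise it.
        h[current] = max(estimate(current), 1 + estimate(best))
        current = best
        path.append(current)
    raise RuntimeError("step limit reached before the goal")


if __name__ == "__main__":
    wall = {(2, 0), (2, 1), (2, 2)}
    trial = lrta_star((0, 0), (4, 0), wall, width=5, height=5)
    print(len(trial) - 1, "moves")
```

In this hypothetical layout the wall lies between the start and the goal, so the greedy rule first walks the agent into the wall and only detours after its estimates have been raised. This is one way to see why always following the current distance estimates need not minimize travel distance.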