RI: Small: Adaptive Metareasoning for Bounded Rational Agents

$404,722 · FY2018 · CSE · NSF

University of Massachusetts Amherst, Amherst, MA

Investigators

Abstract

Metareasoning is the process by which an intelligent agent monitors and controls its own thought processes so as to produce effective action in a timely manner. Just as people must decide when to stop thinking and take action, AI systems also need to be able to interrupt their decision-making process and commit to an action or plan. While people often use heuristic methods to determine the interruption time, this project offers metareasoning techniques that optimize the value of computation and stop planning when the urgency to take action outweighs the anticipated benefit of continued computation. The project transforms the ability of researchers and practitioners to create responsive planning systems by offering easy-to-use, off-the-shelf adaptive metareasoning techniques to control them. Additional areas of broader impact include mentoring of student researchers with special attention to underrepresented groups, a range of outreach activities to local schools, targeted activities to increase diversity in computer science, and industrial collaborations.

The approach uses planning algorithms that can be interrupted at any time, offering a tradeoff between runtime and quality of results. To take advantage of this tradeoff, novel metareasoning techniques are developed that overcome the drawbacks of existing methods. The key idea is to replace the reliance on extensive offline experiments by creating new ways to predict performance and adapt the prediction quickly to the specific problem instance at hand. The project answers fundamental questions about the feasibility, efficiency, and scalability of optimizing meta-level control with minimal computational overhead.
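
The stopping rule described above (keep computing only while the anticipated benefit exceeds the cost of delay) can be illustrated with a minimal sketch. This is not the project's method: it assumes a simple myopic rule with a moving-average predictor fitted online to the current problem instance, and the names `predicted_gain`, `run_with_monitor`, `cost_per_step`, and `window` are hypothetical.

```python
from typing import Iterable, List, Tuple


def predicted_gain(history: List[float], window: int = 3) -> float:
    """Estimate the quality improvement of one more step.

    Assumption: the next improvement is approximated by the average
    per-step improvement over the last `window` steps of this run.
    """
    if len(history) < 2:
        return float("inf")  # too little evidence yet: keep computing
    recent = history[-(window + 1):]
    return (recent[-1] - recent[0]) / (len(recent) - 1)


def run_with_monitor(
    qualities: Iterable[float],
    cost_per_step: float,
    window: int = 3,
) -> Tuple[int, float]:
    """Consume solution qualities produced by an anytime algorithm.

    Stops as soon as the predicted gain of another step no longer
    exceeds the (hypothetical) time cost of that step, or when the
    algorithm itself finishes. Returns (steps_taken, final_quality).
    """
    history: List[float] = []
    for quality in qualities:
        history.append(quality)
        if predicted_gain(history, window) <= cost_per_step:
            break  # urgency now outweighs the anticipated benefit
    if not history:
        raise ValueError("the anytime algorithm produced no solutions")
    return len(history), history[-1]
```

Passing a generator as `qualities` lets the monitor interrupt the underlying computation lazily: no further steps are requested once the rule fires. With a synthetic profile whose quality after step t is 1 - 0.5**t and `cost_per_step=0.05`, this sketch stops after six steps.
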
The main contributions are: (1) online performance prediction methods for efficient meta-level control of anytime algorithms that outperform state-of-the-art methods; (2) a novel approach to create and adapt meta-level control policies online using reinforcement learning techniques; (3) extensions of the above methods to control a portfolio of anytime algorithms, allowing transitions from one algorithm to another using shared intermediate solution representations; and (4) extensions of the above methods to control the internal operation of adjustable anytime algorithms. The team evaluates the new metareasoning techniques on complex computational tasks using a range of anytime algorithms based on different programming paradigms and demonstrates ease of use and significant performance gains relative to existing metareasoning techniques.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
