Collaborative Research: CIF: Medium: Learning to Control from Data: from Theory to Practice

$800,000 | FY2022 | CSE | NSF

University Of California-Berkeley, Berkeley CA

Investigators

Abstract

Data-driven decision-making is playing an increasingly critical role in today's world, with examples ranging from epidemic response to ridesharing optimization. However, learning an optimal control policy from data faces challenges in both the offline and online settings: (a) (Offline) It is unclear how to most efficiently utilize an available dataset that was collected a priori, especially when it does not cover all possible scenarios of interest. (b) (Online) It is unclear how to collect a dataset through minimal interactions with the environment in situations where it may be costly and unsafe to do so. Driven by the need to address these two challenges, this project aims to improve the sample efficiency of reinforcement learning (RL) in both settings. In addition, the project plans to incorporate the adaptivity and trustworthiness that are required in practice. Activities complementary to these research thrusts include the training of future leaders of academia, industry, and government by equipping them with fundamental skills in data-driven decision-making.

The goal of this project is to develop the theory and algorithms for a new generation of data-driven decision rules in order to address critical challenges in modern RL. Specifically, the research agenda aims (i) to design sample-efficient and computationally efficient algorithms for online and offline RL with function approximation, and (ii) to enhance the adaptivity and trustworthiness of existing RL paradigms. To achieve the first goal, it is proposed to incorporate optimistic exploration for online RL and pessimistic exploitation for offline RL into existing approaches with the help of faithful uncertainty quantification for neural networks. To achieve the second goal, it is proposed to incorporate model selection into existing approaches with the help of tight sample complexity characterizations.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
