Interface of Statistical Learning and Optimal Decisions

$500,000 · FY2022 · MPS · NSF

Princeton University, Princeton NJ

Abstract

Massive datasets are routinely collected in the biological, natural, and social sciences and in engineering, and they have had a huge impact on statistical analysis, personalized treatments, and decision-making. The driving engines behind these successes are the representation power of deep learning and the dynamic policy optimization framework of Markov decision processes, together with the availability of big data. However, training algorithms still require enormous amounts of time and computing power, and their statistical and algorithmic efficiencies remain poorly understood. The aim of this project is to understand and improve statistical methods used in deep learning, reinforcement learning, and big data analysis, with an emphasis on the interfaces between statistical modeling and optimal policy learning. It aims to advance knowledge in AI research, automatic driving and control, e-commerce, molecular mechanisms, biological processes, genetic associations, brain functions, and economic and financial risks. The project will integrate research and education by working closely with undergraduate students, graduate students, and postdoctoral fellows, and will develop publicly available computer software with sound theoretical support.

The project aims to develop and understand various new statistical methods used in deep learning, to introduce statistical modeling and learning techniques that enhance policy optimization in reinforcement learning, and to address several important issues in the analysis of big data. The first aim is to provide a theoretical understanding of various techniques used in deep learning. The investigator will study the role of over-parametrization in nonlinear models and low-rank matrix recovery, seek to understand minimum-norm interpolation, and elucidate the interactions between neural network models and the tails of the data distribution. The second aim is to study the interface between statistical modeling and optimal decision-making. The investigator plans to study contextual dynamic pricing using semiparametric models and structured nonparametric models and to unveil the statistical theory that underpins the success of deep reinforcement learning from an adaptive function approximation point of view using hierarchical composition models. The investigator will also introduce new dimensionality reduction techniques and theories for policy learning to improve both statistical and algorithmic efficiency. The third aim is to address several stylized issues in big data analytics, including Markovian dependence, missing data, highly correlated measurements, censored responses, and distributed data.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
