CAREER: Ensemble Kalman Methods and Bayesian Optimization in Inverse Problems and Data Assimilation
University of Chicago, Chicago, IL
Investigators
Abstract
Blending complex predictive models with data is essential in many applications, including numerical weather forecasting, climate science, petroleum engineering, signal processing, and medical imaging. The challenges posed by the increasing complexity of forward models and dynamical systems in inverse problems and data assimilation can be mitigated by the development of computational methods that are derivative-free and require few model evaluations. This project is concerned with two important families of cost-efficient, derivative-free algorithms: ensemble Kalman methods and Bayesian optimization. Rigorous mathematical analyses will be established that contribute to the understanding of these algorithms, determining their potential and limitations in high-dimensional inverse problems and data assimilation. Methodological contributions will focus on the design of novel algorithms and computational frameworks to merge derivative-free optimization with machine learning. Numerical implementations of these new algorithms will be made publicly available. Beyond inverse problems and data assimilation, the principal investigator will also investigate the potential of ensemble Kalman methods and Bayesian optimization in large-scale scientific computing problems where gradients are unavailable or expensive to compute, and in data science applications where privacy is a concern. A central component of the project is the integration of education and research. The principal investigator will engage in the new Preceptor Program, a collaborative initiative to build data science curricula at community colleges in Chicago. This program will create pathways for community college students to transfer to the University of Chicago. In addition, the investigator will complete two books aimed at graduate and upper-level undergraduate students that will incorporate topics drawn from this project. Mentorship of graduate students and the development of graduate-level courses are core parts of the project.
The project will consist of two interrelated research thrusts on ensemble Kalman methods and Bayesian optimization. Ensemble Kalman methods are popular algorithms in the geophysical sciences, where they are often used with a small ensemble size to keep the number of model evaluations low. The first research thrust of the project will develop a new comprehensive non-asymptotic analysis of ensemble Kalman methods that rigorously explains when and why a small ensemble size may suffice. Previous analyses have focused instead on large-ensemble asymptotics, which cannot explain the practical success of these algorithms with a small ensemble size. Methodological contributions of this research thrust will focus on deriving principled frameworks to blend ensemble Kalman methods and machine learning, as well as novel regularization techniques based on hierarchical formulations of inverse problems and data assimilation. The proposed non-asymptotic theory will establish ensemble size requirements for these new methods. The second research thrust of the project will advance Bayesian optimization in graphical and manifold settings by developing new geometry-aware kernels, acquisition functions, and convergence guarantees. The principal investigator will use tools from computational harmonic analysis to obtain approximation guarantees for stochastic processes on manifolds and tools from information theory to obtain regret bounds under misspecified models. Finally, the investigator will explore synergistic ways to combine ensemble Kalman methods and Bayesian optimization, leveraging the strengths of both families of algorithms to mitigate their weaknesses.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.