CAREER: Extending the reach of empirical Bayes: Calibration, nuisance parameters, likelihood asymptotics, and machine learning

$152,943 · FY2025 · MPS · NSF

University of Chicago, Chicago, IL

Abstract

Scientists across various fields face the challenge of answering numerous related questions with limited or noisy data. For example, genomicists may need to assess thousands of genes using data from only a few subjects, while survey statisticians might analyze average incomes in many towns based on limited surveys. To address these challenges, researchers often borrow information from related questions, a process that requires sophisticated statistical reasoning due to varying underlying characteristics. Empirical Bayes offers a method to enhance individual question inference by sharing information, potentially improving statistical accuracy. However, it relies on strong modeling assumptions, limiting its applicability. This research aims to make empirical Bayes more powerful and accessible by integrating it with modern machine learning and causal inference, demonstrating its effectiveness under fewer assumptions, and developing methods to assess uncertainty more effectively. These advancements will help practitioners utilize empirical Bayes in various scientific and industry contexts without extensive statistical modeling. Additionally, the project will produce a monograph and provide training for students and preceptors on modern data science challenges using empirical Bayes.

This research aims to advance empirical Bayes by developing new methodologies and statistical theories to address four key limitations: the assumption of known likelihoods, the treatment of nuisance parameter heterogeneity, the development of nonparametric inference methods, and the integration with machine learning. The project will create new inference methods for primary parameters by using empirical partially Bayes methods and Bayesian nonparametrics, enhancing frequentist Bayes multiple testing theory with guarantees like false discovery rate control. Additionally, the research seeks to learn the unknown Bayes rule through neural networks with a specialized loss function, incorporate James-Stein shrinkage for combining unbiased estimates with semisupervised machine learning, and integrate empirical partially Bayes inference with doubly robust double machine learning for large-scale causal inference.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
