Collaborative Research: Optimal Bayesian Concentration Rates from Double Empirical Priors
North Carolina State University, Raleigh NC
Investigators
Abstract
Statisticians frequently encounter problems that involve complicated models with high-dimensional parameters, particularly in "big data" settings. From a Bayesian perspective, it is imperative in these problems that the prior distribution be chosen to sit in a good position, and information about where a good position lies can come from the data. There is a potential danger with this basic strategy, namely, that using the data twice might cause the model to track the data too closely, resulting in overfitting. To avoid this, the PIs introduce a regularization technique that suitably re-weights the likelihood, preventing the model from learning too quickly. This general "double empirical Bayes" strategy, where the prior is centered on the data and the likelihood is re-weighted, will be applied to several important and challenging high-dimensional problems, including estimation of sparse high-dimensional precision matrices, which is relevant to estimation of large complex networks.
In this project, the PIs will develop this new double empirical Bayes framework for inference on high-dimensional parameters with a relatively low "complexity" or "effective dimension". For example, in function estimation problems, the posited smoothness of the function is a constraint on its complexity. The first step of the double empirical Bayes strategy is to use a prior, indexed by the complexity of the parameter, centered at a complexity-specific estimate of the parameter based on the data. To prevent the posterior from tracking the data too closely, the second step is to re-weight the likelihood before it is combined with the data-dependent prior. The result is a sort of posterior distribution on the parameter space, and the PIs will provide general conditions for this posterior to concentrate around the truth at optimal rates.
An additional advantage of this new approach is that the complexity-specific priors, when suitably centered, can be taken to have a relatively simple form, which facilitates computation. The PIs will investigate the double empirical Bayes analysis of several important high-dimensional inference problems, including density and function estimation, variable selection problems in non-linear models, and estimation of sparse precision matrices. Software will be developed for each application.
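As a purely illustrative sketch of the two steps described in the abstract (a data-centered prior and a re-weighted likelihood), consider the simplest conjugate case: a single normal mean with known variance. The award text does not specify the form of the re-weighting or of the prior, so the fractional power `alpha` applied to the likelihood, the normal prior with scale `tau`, and the function name `double_eb_normal_posterior` are all assumptions made here for illustration, not the PIs' method.

```python
import random
import statistics


def double_eb_normal_posterior(y, sigma=1.0, tau=1.0, alpha=0.5):
    """Toy 'double empirical Bayes' posterior for a normal mean (hypothetical helper).

    Assumed model: y_i ~ N(theta, sigma^2) with sigma known.
    Step 1 (assumed form): prior theta ~ N(ybar, tau^2), centered at the sample mean.
    Step 2 (assumed form): likelihood raised to the power alpha in (0, 1].
    Returns the posterior mean and variance, available in closed form here.
    """
    n = len(y)
    ybar = statistics.fmean(y)
    center = ybar                      # step 1: prior centered on the data
    like_prec = alpha * n / sigma**2   # step 2: tempering scales the likelihood precision
    prior_prec = 1.0 / tau**2
    post_var = 1.0 / (like_prec + prior_prec)
    post_mean = post_var * (like_prec * ybar + prior_prec * center)
    return post_mean, post_var


rng = random.Random(0)
y = [rng.gauss(2.0, 1.0) for _ in range(50)]  # simulated data, true mean 2.0

mean_t, var_t = double_eb_normal_posterior(y, alpha=0.5)  # re-weighted likelihood
mean_f, var_f = double_eb_normal_posterior(y, alpha=1.0)  # ordinary likelihood

print("tempered posterior:   mean", mean_t, "variance", var_t)
print("untempered posterior: mean", mean_f, "variance", var_f)
```

In this toy case the data-centered prior leaves the posterior mean at the sample mean, while `alpha < 1` inflates the posterior variance relative to `alpha = 1`. That is one simple sense in which re-weighting slows how quickly the model learns from data that has been used twice.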