A Theoretical Foundation for Applications of Bayesian Variable Selection

$38,087 | FY2007 | MPS | NSF

Northwestern University, Evanston IL

Investigators

Abstract

In the popular approach of "Bayesian Variable Selection" (BVS), one uses prior and posterior distributions to select a subset of candidate variables to enter the model. In some examples, the number of candidate variables K can be much larger than the number of study units n. This idea has been applied to various statistical models, e.g., regression, graphical models, survival analysis, and cluster analysis. Despite the popularity of BVS, its theoretical properties, especially frequentist convergence properties, have not been well established. Recently, the investigators have successfully studied the frequentist convergence properties (consistency, convergence rates, and predictive performance) of BVS for generalized linear models. This project considers a completely new direction: studying BVS with a Gibbs posterior, a construction originating in statistical mechanics. In contrast to the usual Bayesian posterior, which is constructed from a likelihood function, the Gibbs posterior is constructed from a risk function of practical interest (such as the classification error) and aims at minimizing that risk without modeling the data probabilistically. This can improve performance over the usual Bayesian approach, which depends on a probability model that may be misspecified. The investigators study the statistical performance of BVS with a Gibbs posterior constructed for the purpose of classification. Conditions are provided under which BVS achieves good classification performance, even in the presence of high dimensionality (K >> n).

BVS has multi-disciplinary applications that include various practices of data mining, where a few important variables are to be selected from many candidates and used for prediction and decision making, e.g., pattern recognition, fraud detection, homeland security, customer-oriented marketing decisions, machine learning, microarray analysis, and bioinformatics. These applications typically involve many candidate variables (sometimes many more than the sample size). By selecting a few important variables, BVS can be very helpful for interpretation, prediction, and decision making in each of these applications, despite the potentially high dimensionality. The current project will provide a theoretical framework and conditions under which BVS with a Gibbs posterior is nearly optimal in some sense, despite high dimensionality, thereby providing theoretical justification for this important technique. A solid theoretical foundation for BVS will also lead to better interpretation of the results obtained from BVS, and will provide useful information for practitioners on specifying the prior distribution. Such a theoretical understanding will likely lead to improved empirical performance in many circumstances.
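To make the idea of a Gibbs posterior over variable subsets concrete, the following is a minimal sketch, not the investigators' method. It assumes the commonly used form in which each candidate subset receives weight proportional to exp(-lam * n * R_n(subset)) * prior(subset), where R_n is the empirical classification error. The synthetic data, the simple sign-of-sum classification rule, the temperature lam, the prior inclusion probability prior_incl, and all function names are hypothetical choices made only for illustration.

```python
import itertools
import math
import random


def make_data(n=40, K=6, seed=0):
    """Synthetic data (an assumption for illustration): only variable 0 is informative."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.choice([0, 1])
        row = [rng.gauss(0.0, 1.0) for _ in range(K)]
        # Shift variable 0 by +3 for class 1 and -3 for class 0.
        row[0] += 3.0 if label == 1 else -3.0
        X.append(row)
        y.append(label)
    return X, y


def empirical_risk(X, y, subset):
    """Fraction of misclassified units under a hypothetical rule:
    predict class 1 when the sum of the selected variables is positive."""
    errors = 0
    for row, label in zip(X, y):
        pred = 1 if sum(row[j] for j in subset) > 0 else 0
        if pred != label:
            errors += 1
    return errors / len(y)


def gibbs_posterior(X, y, max_size=2, lam=2.0, prior_incl=0.2):
    """Gibbs posterior over variable subsets of size 1..max_size.

    Weight of a subset is exp(-lam * n * risk) times an independent
    Bernoulli(prior_incl) inclusion prior; no likelihood is used.
    """
    n, K = len(y), len(X[0])
    log_w = {}
    for size in range(1, max_size + 1):
        for subset in itertools.combinations(range(K), size):
            log_prior = (size * math.log(prior_incl)
                         + (K - size) * math.log(1.0 - prior_incl))
            log_w[subset] = -lam * n * empirical_risk(X, y, subset) + log_prior
    # Normalize in log space for numerical stability.
    m = max(log_w.values())
    Z = sum(math.exp(v - m) for v in log_w.values())
    return {s: math.exp(v - m) / Z for s, v in log_w.items()}


if __name__ == "__main__":
    X, y = make_data()
    post = gibbs_posterior(X, y)
    for subset, p in sorted(post.items(), key=lambda kv: -kv[1])[:3]:
        print(subset, round(p, 4))
```

Because the weights depend on the data only through the classification error, the sketch never specifies a probability model for the labels, which is the feature the abstract contrasts with the likelihood-based posterior. Exhaustive enumeration is feasible here only because K is tiny; with K >> n, some form of sampling over subsets would be needed instead.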
