CAREER: High Dimensional Variable Selection and Risk Properties

$400,000 | FY2010 | MPS | NSF

University of Southern California, Los Angeles, CA

Abstract

High dimensional variable selection plays a pivotal role in contemporary statistical modeling, learning, and scientific discovery. Long-standing theoretical questions in the literature include how high a dimensionality regularization methods with general penalties can handle, what role the penalty function plays, and how to characterize the optimality of variable selection procedures. The investigator proposes to study four interrelated research topics. First, the investigator studies penalized likelihood methods with general penalties, which are widely applied to simultaneously select important variables and estimate their effects in high dimensional statistical inference, where the dimensionality can be much larger than the sample size. Second, the investigator examines contexts of high dimensional variable selection beyond penalized likelihood methods, including penalized empirical risk and the search for interactions. Third, the investigator proposes new principles for model selection when models are possibly misspecified and studies the robustness of various regularization methods for high dimensional variable selection under model misspecification. Fourth, the investigator further studies the risk properties and optimality of various high dimensional regularization methods in the contexts of penalized least squares and penalized likelihood.

The analysis of vast data sets now commonly arises in diverse fields of science, engineering, and the humanities, ranging from genomics and health sciences to economics, finance, and machine learning. High dimensional data analysis poses numerous challenges to statistical theory, methods, and implementations that are not present in smaller scale studies. A major goal of this proposal is to make theoretical and methodological contributions to the important and challenging topic of high dimensional variable selection and statistical inference. These new developments provide a unified and systematic understanding of various regularization methods in high dimensions, and allow scientists to analyze high dimensional data with increased efficiency, expediency, and interpretability. The proposed work is incorporated into new courses on state-of-the-art high dimensional statistical learning, and will benefit the training and learning of undergraduates, graduate students, and underrepresented minorities. The proposed work on variable selection in high dimensions will not only help better identify factors that are important to, for example, public health and market risk, but also benefit a broad range of scientists and researchers in various fields.
