Discriminant Analysis in High-Dimensional Latent Factor Models

$180,000 | FY2022 | MPS | NSF

Cornell University, Ithaca NY

Abstract

This research project concerns classification of high-dimensional features, an important part of statistical learning theory. The project will formulate high-dimensional latent factor models whose low-dimensional, hidden structure guarantees successful statistical classification based on suitable projections of the high-dimensional data. Results of this research are expected to advance understanding of how to achieve optimal classification. The project has important applications to recent advances in immunology and cancer studies, which have revealed that hidden mechanisms can be directly connected to health outcomes. It offers a principled way to analyze such high-dimensional datasets and will provide computationally efficient classification rules. The project will involve collaboration with computational biologists to validate the new models and methodology.

Specifically, this project constructs novel classifiers based on principal component analysis with a necessary debiasing step, followed by linear discriminant analysis, and develops their statistical and computational properties. The project focuses on the important subclass of tuning-free classifiers that interpolate the data but still possess good predictive power. In addition, this research aims to develop minimax adaptive bounds for the excess misclassification error under general latent factor model specifications and to prove that the new methods achieve these bounds, thereby establishing their rate optimality. Finally, the usefulness of the new techniques will be demonstrated via applications to data from immunology.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
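As a rough illustration of the pipeline the abstract describes (project onto principal components, then apply linear discriminant analysis), the following is a minimal sketch, not the project's method. It omits the debiasing step the abstract calls necessary, assumes a single latent factor, and runs on simulated data. All function names, the loading vector, and the parameter values are hypothetical choices made for this example, and it uses only the Python standard library.

```python
import math
import random


def top_direction(rows, rng, iters=50):
    """Return (column means, leading principal direction) by power iteration."""
    n, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - mean[j] for j in range(p)] for r in rows]
    v = [rng.gauss(0, 1) for _ in range(p)]
    for _ in range(iters):
        # Apply X^T X to v without forming the p x p covariance matrix.
        scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
        w = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def project(x, mean, v):
    """Score of one observation along the principal direction."""
    return sum((xj - mj) * vj for xj, mj, vj in zip(x, mean, v))


def fit_pca_lda(X, y, rng):
    """Fit a 1-D PCA projection, then two-class LDA on the projected scores."""
    mean, v = top_direction(X, rng)
    s = [project(x, mean, v) for x in X]
    s0 = [a for a, label in zip(s, y) if label == 0]
    s1 = [a for a, label in zip(s, y) if label == 1]
    m0, m1 = sum(s0) / len(s0), sum(s1) / len(s1)
    # Pooled within-class variance, as LDA assumes a common variance.
    pooled = (sum((a - m0) ** 2 for a in s0)
              + sum((a - m1) ** 2 for a in s1)) / (len(s) - 2)
    slope = (m1 - m0) / pooled
    intercept = -slope * (m0 + m1) / 2 + math.log(len(s1) / len(s0))
    return lambda x: int(slope * project(x, mean, v) + intercept > 0)


def simulate(n, loadings, rng, shift=1.5):
    """Draw (X, y) from X = a * z + noise, where the class shifts the mean of z."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        z = rng.gauss(shift if label else -shift, 1.0)
        X.append([a * z + rng.gauss(0, 1) for a in loadings])
        y.append(label)
    return X, y


rng = random.Random(0)
loadings = [rng.gauss(0, 1) for _ in range(50)]  # 50 features, one hidden factor
X_train, y_train = simulate(200, loadings, rng)
X_test, y_test = simulate(500, loadings, rng)

classify = fit_pca_lda(X_train, y_train, rng)
accuracy = sum(classify(x) == label for x, label in zip(X_test, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

In this toy setting the class label affects the 50 observed features only through one hidden variable, so a single principal direction recovers most of the discriminative signal. With more than one latent factor the projection would need several components and the LDA step would use a pooled covariance matrix instead of a scalar variance.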
