High Dimension, Low Sample Size Discrimination

$105,000 · FY2008 · MPS · NSF

University Of Georgia Research Foundation Inc, Athens GA

Abstract

The proposed research is motivated by the discrimination problem for high dimension, low sample size data. The investigator studies the intrinsic difficulties of the discrimination problem by exploring the asymptotic geometric structure of such data. Three main activities are proposed: a) the asymptotic inconsistency of leave-one-out cross-validation, where the study is expected to explain why it fails when the number of variables greatly exceeds the number of observations; b) the effect of the relationship between the dimensionality and the sample size on the difficulty of the discrimination task; and c) a discriminant direction vector that exists only for high dimension, low sample size data. The data points collapse onto this direction vector and, at the same time, are most separated by group label. The investigator plans to study the theoretical and empirical properties of the procedure, such as its optimality, uniqueness, and asymptotic performance. The overall goal is to investigate the nontraditional and unique challenges in high dimension, low sample size discrimination. The proposed approach may be regarded as atypical, but it is more relevant to the problem itself. Applications of the proposed research include text document classification such as spam email filtering, medical imaging such as functional magnetic resonance imaging, and bioinformatics such as microarray gene expression and proteomics.
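The collapse described in activity (c) can be illustrated numerically. The sketch below is not taken from the award record and is not necessarily the investigator's construction. It assumes NumPy, two groups coded 0 and 1, and more variables than observations. The helper name `piling_direction` and the tolerance `tol` are hypothetical. It finds the minimum-norm direction whose projections of the centered data equal a two-valued target, so every point in a group lands on the same projected value.

```python
import numpy as np

def piling_direction(X, labels, tol=1e-10):
    """Return a unit vector w such that X @ w takes one value per group.

    X: (n, d) array with d >= n; labels: length-n array of 0/1 group labels.
    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n1 = np.sum(labels == 1)
    n0 = np.sum(labels == 0)
    # Zero-sum target: +1/n1 for group 1, -1/n0 for group 0.
    target = np.where(labels == 1, 1.0 / n1, -1.0 / n0)
    # Center the data so the direction is found relative to the overall mean.
    Xc = X - X.mean(axis=0)
    # Minimum-norm solution of Xc @ w = target via a truncated SVD.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    keep = s > tol * s[0]  # drop the numerically zero singular value
    w = Vt[keep].T @ ((U[:, keep].T @ target) / s[keep])
    return w / np.linalg.norm(w)

# Pure-noise example: 20 observations, 500 variables, no true group difference.
rng = np.random.default_rng(0)
n, d = 20, 500
X = rng.normal(size=(n, d))
labels = np.repeat([0, 1], n // 2)

w = piling_direction(X, labels)
proj = X @ w
spread0 = proj[labels == 0].max() - proj[labels == 0].min()
spread1 = proj[labels == 1].max() - proj[labels == 1].min()
gap = abs(proj[labels == 1].mean() - proj[labels == 0].mean())
print(spread0, spread1, gap)
```

In this simulated setting the within-group spreads are zero up to floating-point error while the gap between groups is not, even though the data are pure noise. This is possible only because the number of variables exceeds the number of observations. It also hints at why resampling-based error estimates need care in this regime.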
