Modelling Covariance Structure Randomly, with Applications in Bootstrapping, Robust Statistics, and Deep Learning

$200,000 | FY2021 | MPS | NSF

University Of California-Davis, Davis CA

Investigators

Abstract

High-dimensional data with random covariance structure arise naturally in many fields, ranging from genetics and epidemiology to atmospheric and environmental sciences, medical sciences, social sciences, and artificial intelligence. A thorough understanding of these large data matrices is urgently needed in the era of Big Data. However, the current literature has mostly focused on the case where the population covariance matrix is deterministic. This project will address this gap by establishing novel results for data matrices with random population covariance structure. The research findings will demonstrate how random covariance structure can enhance the understanding of complex and massive data sets. Moreover, this project will produce new and improved models and tools for analyzing high-dimensional noisy data sets, which can provide more meaningful and interpretable information.

This project aims to study a general class of large-dimensional matrices with random population covariance structure and to address several challenges in high-dimensional statistics and deep learning. It is the first time that the intersection between random matrix theory and extreme value theory is studied in full generality. The goals include: (1) establishing novel theory and results for the top eigenvalues and principal components of large-dimensional sample or separable covariance matrices with random covariance structure; (2) answering the question of whether bootstrapping is suitable for high-dimensional inference, and how the standard bootstrapping procedure can be modified when it fails for massive data; (3) constructing new statistics for statistical inference problems involving high-dimensional elliptically distributed data in full generality, including heavy-tailed data sets; and (4) providing novel insights into the phase transitions of fully connected two-layer neural networks.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
