
Statistical Estimation from Decoupled Data

$249,999 · FY2020 · MPS · NSF

New York University, New York NY

Investigators

Abstract

Modern statistics is defined by the fact that a great deal more data is available to practitioners than ever before. This is particularly the case in the sciences, where advances in experimental methodology across fields such as biology, chemistry, and physics have led to an explosion of different types of data, collected by different measurement apparatuses at potentially different times. Moreover, it may be difficult or impossible to connect data points from different experiments. For example, a chemist may apply two different measurement techniques to the same batch of molecules to obtain high-quality data about the whole batch, but it may be challenging to track the identities of particular molecules between measurements. The statistician who wishes to make the best possible inferences is faced with the difficult problem of how to integrate the data from different sources to conduct a unified analysis. Despite the ubiquity of this problem, rigorous statistical analyses of procedures designed to work with decoupled data are rare. The main goal of this project is to develop new tools for performing estimation tasks with decoupled data and to establish the fundamental limits of such techniques. This project will have an impact on scientific and statistical methodology in both research and industrial settings. The graduate student support will be used for interdisciplinary research and for writing code.

The project will investigate optimal rates of estimation for regression problems given access to decoupled data and will establish potential trade-offs. Several intermediate regimes will be considered, for example, where the experimenter has access to many independent batches of shuffled data or to data with partial coupling information. This project will quantify the statistical price of learning with decoupled data via tight minimax bounds. This research is also aimed at establishing when minimax statistical procedures can be made computationally efficient, and at investigating the possible presence of gaps between the information-theoretic and computationally achievable rates of estimation.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

View original record on NSF Award Search →