GGrantIndex
← Search

CRII: CIF: Towards Linear-Time Computation of Structured Data Representations

$173,282FY2016CSENSF

Iowa State University, Ames IA

Investigators

Abstract

Estimating an unknown object from noisy, nonlinear, and incomplete observations constitutes a basic problem in data science. Standard solutions first assume that the unknown object obeys a structured mathematical representation, and then develop optimization algorithms for recovering the parameters of the representation. Moreover, rigorous analysis reveals that several such algorithms are statistically optimal. However, despite these advances in statistical understanding, the role of computation is far less well understood; in many cases, even the best methods incur a computation time proportional to a high-degree polynomial in terms of the data size. Therefore, to reap the full benefits of these methods in applications involving massive data, new algorithmic approaches are necessary. This research project introduces new theory and computational tools for learning structured data representations in linear running time. The new algorithmic approaches are based on the intuition that challenging optimization problems encountered in data analysis can be circumvented if the answers are merely approximate, rather than exact. Establishing precise tradeoffs between statistical performance, approximation quality, and running time is a key focus. With this intuition, the project addresses three specific problems: (i) Reconstructing signals and images from nonlinear observations. (ii) Recovering graphs from partial observations, such as linear sketches or inter-node distance measurements. (iii) Estimating multidimensional probability distributions from random samples. Algorithms developed within the scope of the project impact applications ranging from medical imaging, computer network monitoring, social network analysis, and non-destructive evaluation (NDE). All publications, data, and source code are publicly available. The project involves the active participation of both graduate and undergraduate students, and exposes them to a span of areas including mathematics, statistics, computer science, and optimization.

View original record on NSF Award Search →