
Computational and Communication Efficient Distributed Statistical Methods with Theoretical Guarantees

$475,000 | FY2016 | MPS | NSF

Georgia Tech Research Corporation, Atlanta GA

Abstract

In many contemporary data-analysis settings, it is expensive or infeasible to gather the entire data set at a central location. Recent work in computational mathematics and machine learning has made great strides in distributed optimization and distributed machine learning. Classical statistical methodology, theory, and computation, on the other hand, typically assume that the entire data set is available at a central location; this is a significant shortcoming in modern statistical knowledge. The statistical methodology and theory for distributed inference are underdeveloped. The PI will develop new distributed statistical methods that are efficient in both computation and communication. He will study the theoretical guarantees of these distributed statistical estimators. The applicability of these methods, and the need for them, will be explored and demonstrated across a wide spectrum of application domains. This research can have impact in healthcare, supply chain industries, retail and services, and many other areas.

Building on recent work in applied mathematics and machine learning, the PI will explore theory, algorithms, and applications of statistical procedures developed for distributed data and aggregated inference (i.e., distributed inference), with attention to the storage requirements, computational complexity, and statistical properties of the relevant estimators. The project will develop practical models, statistical theory, and computationally efficient, provably correct algorithms that can help scientists conduct more effective distributed data analysis. The statistical properties of these methods will be studied thoroughly, including analysis of asymptotic properties, simulation studies in finite-sample cases, and demonstration of effectiveness in real applications. PhD students will be involved in the research. Course modules will be developed and made publicly available.
