CIF: Small: Communication-efficient and robust learning from distributed data
George Mason University, Fairfax VA
Investigators
Abstract
Machine learning workflows are increasingly distributed across networks of connected devices or data centers. In distributed data networks that support big data applications, the cost of moving data or model parameters among computing nodes has become a common bottleneck for distributed machine learning algorithms. This project develops communication-efficient and robust techniques for distributed learning, particularly for decentralized networks that lack central coordination. The key idea is communication censoring, in which each node transmits its local update only when its own assessment finds the change in local information significant enough. The outcomes of this research are expected to benefit many resource-constrained distributed learning applications, such as structural monitoring for critical infrastructure, location-aware services, the Internet of Things, and mobile healthcare.
The goal of this project is to develop communication-efficient and robust approaches to distributed stochastic optimization for learning from locally stored private data in big data computing. A communication-censoring framework is introduced into the design of variance-reduced stochastic optimization techniques to reduce message exchange among distributed nodes while globally optimizing a shared learning model with provable convergence, even in the absence of central coordination or synchronization. Further, distributed robust aggregation techniques are developed to combat the impact of malicious attacks, malfunctioning nodes, and transmission link failures, with added protection of data privacy.
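As a rough illustration of the censoring idea, the sketch below runs decentralized gradient descent on a ring of nodes, where each node broadcasts its iterate only when it has moved at least a threshold away from the value it last sent. This is a minimal sketch under stated assumptions, not the project's algorithm: it uses plain deterministic gradients rather than variance-reduced stochastic ones, and the function name `censored_dgd`, the quadratic local losses, the ring topology, and the geometrically decaying threshold `tau0 * rho**k` are all hypothetical choices made for the example.

```python
def censored_dgd(targets, steps=400, alpha=0.05, tau0=1.0, rho=0.97):
    """Decentralized gradient descent on a ring with censored broadcasts.

    Illustrative assumption: node i holds the private loss
    f_i(x) = 0.5 * (x - targets[i])**2 and talks only to its two ring
    neighbors (so at least three nodes are expected).
    Returns (final local iterates, number of broadcasts actually sent).
    """
    n = len(targets)
    x = [0.0] * n        # current local iterate at each node
    x_hat = [0.0] * n    # last value each node actually broadcast
    sent = 0
    for k in range(steps):
        tau = tau0 * rho ** k  # censoring threshold, shrinking over time
        new_x = []
        for i in range(n):
            # Mix the node's own iterate with its neighbors' last broadcasts.
            mix = (x[i] + x_hat[(i - 1) % n] + x_hat[(i + 1) % n]) / 3.0
            grad = x[i] - targets[i]  # gradient of the local quadratic loss
            new_x.append(mix - alpha * grad)
        x = new_x
        for i in range(n):
            # Censoring rule: broadcast only if the change is significant.
            if abs(x[i] - x_hat[i]) >= tau:
                x_hat[i] = x[i]
                sent += 1
    return x, sent
```

Calling `censored_dgd([1.0, 2.0, 3.0, 4.0, 5.0])` returns the local iterates and the broadcast count, which can be compared against the `len(targets) * steps` messages an uncensored run would send. With a constant step size the nodes do not agree exactly, but their average settles near the mean of the targets.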
The developed theory and mechanisms for communication censoring and robust aggregation center on key ideas that let distributed nodes collaboratively evaluate the informativeness of their computations and jointly assess robust statistics without data sharing, even in the absence of central coordination. Rigorous analyses are conducted to delineate the convergence conditions, convergence rates, and the tradeoff between efficiency and robustness. These advances offer tools for implementing practical distributed machine learning systems in a broad range of applications.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
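The abstract does not say which robust statistics the project uses. As one common stand-in, the sketch below aggregates the parameter vectors a node receives from its neighbors with a coordinate-wise trimmed mean, so that a few extreme messages from faulty or malicious senders are discarded before averaging. The function names `trimmed_mean` and `robust_aggregate` and the `trim` parameter are hypothetical and chosen only for illustration. Only model parameters are exchanged here, never raw data.

```python
def trimmed_mean(values, trim):
    """Drop the `trim` smallest and `trim` largest values, then average the rest."""
    if trim < 0 or 2 * trim >= len(values):
        raise ValueError("trim must satisfy 0 <= 2 * trim < len(values)")
    kept = sorted(values)[trim:len(values) - trim]
    return sum(kept) / len(kept)


def robust_aggregate(messages, trim=1):
    """Coordinate-wise trimmed mean over equal-length parameter vectors.

    `messages` is a list of vectors (lists of floats), one per sender.
    Up to `trim` arbitrarily corrupted senders per coordinate are discarded.
    """
    return [trimmed_mean(list(coord), trim) for coord in zip(*messages)]
```

For example, if four honest neighbors send vectors close to `[1.0, 2.0]` and a fifth sends `[1e6, -1e6]`, `robust_aggregate(messages, trim=1)` stays close to the honest values, whereas a plain average is dominated by the outlier.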