Collaborative Research: CNS Core: Medium: Data-Centric Networks for Distributed Learning

$550,000 · FY2021 · CSE · NSF

Northeastern University, Boston MA

Investigators

Abstract

Machine learning algorithms have revolutionized many fields by giving them the ability to use historical data for making predictions or detecting patterns that can then be used to automate various tasks and create new applications for users. The data that many of today’s machine learning applications require, however, is often collected by a network of multiple sensors. For example, data from environmental sensors in smart cities can be used to predict air pollution or traffic at different locations in the city. Analyzing this data with machine learning algorithms then requires these devices to cooperate with each other, exchanging data and models. This project designs mechanisms for devices to cooperate efficiently. Distributing machine learning algorithms is particularly challenging when devices are heterogeneously resource-constrained, e.g., with varying compute, power, or bandwidth limitations, as is often the case in today’s networks. Traditional learning algorithms either bring all data to a single location for analysis, or entirely distribute the learning algorithm to the data sources. A more flexible approach that instead intelligently brings data to the computing components of the learning algorithms, and conversely brings computing to data sources, can better harness these devices’ resources, but raises a natural question of how data and model components should be moved through the network. This project develops a data-centric approach to distributed learning that utilizes advances in Named Data Networking (NDN) to simplify the process of exchanging information, enabling new types of distributed learning algorithms. The outcomes of this project may improve distributed learning in a wide range of potential applications, from smart cities to satellite data analysis to augmented reality. The project also supports ongoing efforts in education and in broadening participation in computing among underrepresented communities.
These efforts include (i) development of new course materials that teach students about the challenges of realistic machine learning deployments, (ii) recruitment of high school and undergraduate students to work on suitably scoped projects that will contribute to the research vision, and (iii) presentations and mentoring sessions aimed at increasing the participation of underrepresented minorities in computing. This project is a collaborative effort between Carnegie Mellon University and Northeastern University. Results, including algorithm implementations, technical reports, and measurement datasets, will be made publicly available on a repository hosted by CMU. These will remain available for at least two years after the conclusion of the project. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
