Collaborative Research: CNS Core: Medium: Systems Support for Federated Learning
University of Wisconsin-Madison, Madison WI
Investigators
Abstract
Traditional approaches to applying machine learning techniques to end-user data often require copying all data to the cloud. This is not only expensive but also poses data privacy risks. By analyzing data on the device where it is generated, federated learning aims to mitigate both the cost and the privacy concerns of centralized machine learning without sacrificing its benefits. This collaborative project brings together investigators from two institutions to develop building blocks for practical federated learning by addressing challenges arising from the diversity of user devices and the heterogeneity of data distributions on those devices. The project takes a three-pronged approach: (1) enable performance improvements for machine learning developers (e.g., judicious participant selection instead of random selection); (2) provide efficiency improvements for service providers (e.g., redundancy elimination for data transfers); (3) enable end-users to control their data privacy (e.g., through controls akin to app permissions in Android) without sacrificing device usability. Two core principles underpin these solutions: multi-tenancy both in the cloud and on individual devices, and maintaining the theoretical correctness, convergence characteristics, and privacy/security guarantees of federated learning algorithms. Widespread adoption of practical federated learning can fundamentally change how we gather insights from end-user data and how users value data privacy, because users may not have to sacrifice privacy for convenience in many cases. This, in turn, can force large corporations to rethink their data collection and usage practices, and influence policymakers to consider stricter privacy regulations. All software from this project will be open source. Through outreach and new educational materials, this project will pioneer the training of privacy-aware systems builders.
This collaborative project will produce software artifacts, experimental harnesses, benchmarks, and results of running those benchmarks and artifacts. These materials will be available for public use under permissive open-source licenses at multiple locations, including https://github.com/symbioticlab. They will be retained for at least three years after the completion of the project. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.