Efficient and Private Decentralized Bayesian Learning

$299,998 · FY2023 · ENG · NSF

Oklahoma State University, Stillwater OK

Investigators

Abstract

This project will promote the progress of science and advance the national health, security, and prosperity by conducting fundamental research to enhance the efficiency, robustness, and privacy of machine learning algorithms. Recent advances and maturation in computational technologies and infrastructure have enabled massive data collection at lower cost, for example, through smart devices, sensor networks, and the Internet of Things. Consequently, datasets used for modeling, learning, and decision making are becoming increasingly distributed. Data-driven models that aggregate information from distributed datasets provide prediction and decision-making capabilities that models learned from individual datasets cannot match. However, centralized processing of distributed datasets requires transferring a large amount of raw data to a central entity, raising concerns about communication bandwidth and privacy. In addition, models learned by existing decentralized optimization techniques are likely to suffer from poor generalization and overconfident decisions, particularly when the data is insufficient or corrupted by noise. This project will develop a theoretical framework to create efficient and private decentralized Bayesian learning algorithms that produce robust models and decisions in the presence of insufficient and noisy data. The desired research outcomes will greatly improve machine learning techniques for processing distributed datasets and promote the progress of national priorities in data science, cyber-physical systems, and high-performance computing. The education and outreach activities will raise awareness of machine learning in engineering among the younger generation and underrepresented groups and encourage prospective students to pursue degrees and careers in science and engineering.

This project explores the challenging problem of decentralized Bayesian learning for multi-agent systems. Decentralized Bayesian learning provides a principled, rigorous framework to process noisy datasets and learn models with uncertainty measures in a fully distributed fashion. The proposed research is expected to create a theoretical framework to design and analyze decentralized Bayesian learning algorithms via gradient-based Markov chain Monte Carlo methods. Furthermore, protocols for enhancing the communication and computational efficiency of the algorithms and for guaranteeing their privacy properties will be investigated and identified. The effectiveness of the proposed algorithms will be established on scientific machine learning applications and engineering problems relevant to modeling and controls. Once validated, the proposed algorithms can become standard tools for solving a wide spectrum of scientific and engineering problems in which data is distributed across a network.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
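
To make the idea of decentralized Bayesian learning via gradient-based Markov chain Monte Carlo concrete, the following is a minimal illustrative sketch, not the project's algorithm. It assumes a toy problem (inferring a scalar Gaussian mean with unit noise variance and a zero-mean Gaussian prior), a ring of four agents, and a decentralized unadjusted Langevin update in which each agent averages its neighbors' samples, takes a gradient step on its local data only, and adds Gaussian noise. The function names (`make_ring_weights`, `decentralized_langevin`), the mixing weights, the step size, and the scaling of the gradient and noise terms by the number of agents are all assumptions chosen for illustration. Agents exchange samples rather than raw data, but this sketch makes no formal privacy guarantee.

```python
import math
import random


def make_ring_weights(n):
    """Symmetric, doubly stochastic mixing matrix for a ring of n agents (assumed topology)."""
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        weights[i][i] = 0.5
        weights[i][(i - 1) % n] += 0.25
        weights[i][(i + 1) % n] += 0.25
    return weights


def decentralized_langevin(local_data, weights, prior_std=10.0, step=1e-3,
                           n_steps=20000, burn_in=2000, seed=0):
    """Hypothetical decentralized Langevin sampler for a scalar Gaussian mean.

    Each agent i keeps one sample x[i] and only sees local_data[i].
    Returns the network-average sample recorded at every step after burn-in.
    """
    rng = random.Random(seed)
    n = len(local_data)
    sums = [sum(d) for d in local_data]
    counts = [len(d) for d in local_data]
    x = [0.0] * n
    trace = []
    for t in range(n_steps):
        # Consensus step: each agent averages the samples of its neighbors.
        mixed = [sum(weights[i][j] * x[j] for j in range(n)) for i in range(n)]
        new_x = []
        for i in range(n):
            # Gradient of the local potential: unit-variance Gaussian likelihood
            # on local data plus a 1/n share of the Gaussian prior.
            grad = counts[i] * x[i] - sums[i] + x[i] / (n * prior_std ** 2)
            noise = rng.gauss(0.0, 1.0)
            # Scaling by n makes the network average follow a Langevin step
            # on the global posterior (an assumption of this sketch).
            new_x.append(mixed[i] - step * n * grad
                         + math.sqrt(2.0 * step * n) * noise)
        x = new_x
        if t >= burn_in:
            trace.append(sum(x) / n)
    return trace


# Toy usage: four agents, 25 noisy observations each, true mean 2.0.
data_rng = random.Random(1)
local_data = [[data_rng.gauss(2.0, 1.0) for _ in range(25)] for _ in range(4)]
trace = decentralized_langevin(local_data, make_ring_weights(4))

# Closed-form posterior for comparison (conjugate Gaussian model).
post_prec = sum(len(d) for d in local_data) + 1.0 / 10.0 ** 2
post_mean = sum(sum(d) for d in local_data) / post_prec
est_mean = sum(trace) / len(trace)
est_std = math.sqrt(sum((s - est_mean) ** 2 for s in trace) / len(trace))
```

In this toy setting the empirical mean and spread of `trace` can be compared with the closed-form posterior mean `post_mean` and standard deviation `1 / math.sqrt(post_prec)`. The spread of the samples is what provides the uncertainty measure that a point estimate from decentralized optimization would lack.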
