CAREER: Coding Theory for Robust Large-Scale Machine Learning

$508,271 | FY2019 | CSE | NSF

University Of Wisconsin-Madison, Madison WI

Investigators

Abstract

Coding theory has played a critical role in modern information technology by supporting robustness of information against a backdrop of multifaceted uncertainty. Following recent successes in machine learning, robustness has emerged as a desired principle, but now in the context of large-scale computation. Challenges related to robustness are prevalent when deploying machine learning solutions in real applications and non-curated settings, which are often non-ideal environments. This project aims to address these challenges by developing novel solutions based on coding theory for computation. These solutions offer provable robustness guarantees, can outperform more traditional solutions in practice, and extend to machine learning systems the gains that have transformed communication and storage systems. Existing and new collaborations of the investigator will facilitate industry cooperation and increase the transition to practice for the frameworks and algorithms generated from this project. The research will be strongly coupled with educational developments guided by recent advances in education science, alongside an outreach program within the Wisconsin Institute for Discovery.

This project aims to develop novel coding-theoretic solutions and fundamental trade-offs for robust large-scale machine learning. The research program is centered around three thrusts. The first thrust focuses on robustness during distributed optimization in the presence of delays and straggler nodes, where the speed of convergence is affected by nodes in the system that are significantly slower than average. The second thrust focuses on robustness during distributed optimization in the presence of Byzantine nodes and worst-case failures. Recent studies proposed robust aggregation rules to filter out the effect of worst-case or adversarial failures. This project develops coding-theoretic solutions that can be orders of magnitude faster, and give rise to unexplored trade-offs between computation and Byzantine tolerance. The third thrust focuses on adversarial perturbations during prediction that can force state-of-the-art models to consistently mis-classify events/data. The coding-theoretic approach of this project pursues provable defense mechanisms against adversarial attacks through ensembles of models with inherent redundancy and through data augmentation. The proposed theoretical and algorithmic solutions are afforded by an interdisciplinary mix of tools from information and coding theory, distributed optimization, and machine learning.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
