Collaborative Research: CIF: Small: Robust Machine Learning under Sparse Adversarial Attacks

$300,000 | FY2023 | CSE | NSF

University of Illinois at Urbana-Champaign, Urbana, IL

Abstract

Machine-learning algorithms have proved successful in many applications, such as detecting handwriting, converting speech to text, detecting traffic signals for autonomous vehicles, or predicting a patient's diagnosis from medical data. A machine-learning model is usually "trained" to perform the designated task. This training is done by feeding many data samples to the model and using algorithms to adjust the model parameters so that its output is consistent with the provided training output most of the time. There are many challenges to performing this task reliably and efficiently. Recent research has shown that making small changes to the data points can lead to misdetection. Therefore, it is critical to make learning models robust against such data perturbations, especially in safety-critical applications such as autonomous driving. This project aims to achieve this for a specific category of data perturbations called "sparse attacks." Sparse-attack scenarios are those in which perturbations occur in only a few coordinates of the data, such as a few pixels in an image. Despite their importance and various real-world applications, sparse attacks have not been widely studied from a theoretical perspective.

The goal of this project is to develop a theoretical framework for robust machine learning in the presence of adversarial perturbations that are bounded in L0 norm, or so-called sparse attacks. There have been significant theoretical studies on non-sparse adversarial attacks, but such fundamental understanding has been lacking for the sparse setting. This is partly due to the challenges in the L0 setting, namely, the L0 ball being non-convex and highly non-smooth. The first goal of this project is to study the fundamental limits of robust classification for stylized mathematical models. This will be done by proposing defense methods that are provably robust against L0 attacks, as well as proving converse results.
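To make the L0 constraint concrete, the following is a minimal illustrative sketch, not the project's method. It assumes a toy linear classifier whose label is the sign of a weighted sum, inputs confined to the range [0, 1] (as with normalized pixel values), and an adversary allowed to overwrite at most k coordinates. The function names `l0_norm`, `linear_score`, and `sparse_attack` are hypothetical and chosen only for this example.

```python
def l0_norm(delta):
    """Count the nonzero coordinates of a perturbation (its L0 'norm')."""
    return sum(1 for d in delta if d != 0)


def linear_score(w, x):
    """Score of a toy linear classifier; the predicted label is its sign."""
    return sum(wi * xi for wi, xi in zip(w, x))


def sparse_attack(w, x, k, lo=0.0, hi=1.0):
    """Overwrite at most k coordinates of x, each kept in [lo, hi],
    so as to lower the linear score as much as possible.

    Assumption: the adversary knows w and the valid input range.
    """
    gains = []
    for i, (wi, xi) in enumerate(zip(w, x)):
        # Pushing a coordinate to lo (positive weight) or hi (otherwise)
        # gives the largest possible drop in the score for that coordinate.
        target = lo if wi > 0 else hi
        gain = wi * (xi - target)  # score reduction, always >= 0
        gains.append((gain, i, target))

    # Spend the budget on the k coordinates with the largest reduction.
    gains.sort(reverse=True)
    x_adv = list(x)
    for gain, i, target in gains[:k]:
        if gain > 0:
            x_adv[i] = target
    return x_adv


if __name__ == "__main__":
    w = [2.0, -1.0, 0.5, 0.0]
    x = [0.9, 0.1, 0.8, 0.5]
    for k in (0, 1, 2):
        x_adv = sparse_attack(w, x, k)
        delta = [a - b for a, b in zip(x_adv, x)]
        print(k, l0_norm(delta), linear_score(w, x_adv))
```

In this toy setting, changing only two of the four coordinates is enough to flip the sign of the score, even though each changed coordinate moves by a large amount. That is the regime sparse attacks describe: few coordinates altered, with no limit on how much each one changes beyond the valid input range.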
Ideally, one aims to establish tight achievability and converse bounds asymptotically to fully characterize the optimal robust classifier. Motivated by practical considerations, the performance of the proposed defense methods in other scenarios will also be studied. In particular, this project explores the generalization properties of the proposed robust hypothesis class in order to study the effect of finite samples when the data distribution is unknown. Furthermore, the project includes an evaluation plan to implement the developed defense mechanisms and analyze their performance in terms of learning a model that is robust against sparse attacks.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.