Career: Building Models that Avoid Spurious Correlations through Interpretability and Representation Learning

$546,682 | FY2022 | CSE | NSF

New York University, New York NY

Investigators

Abstract

Advances in artificial intelligence (AI) modeling allow AI systems to uncover and use many kinds of information to make accurate predictions. These predictions touch our day-to-day life, for example through fitness wearables that monitor health. Sometimes the predictions made by AI models rely on information that is unstable or spurious. For example, a model that relies on sand in the background to decide whether an image contains a camel or a cow will fail when shown a camel in a grassy field. AI models also fail because of spurious information in other domains such as healthcare, where models can make predictions on the basis of how the data was collected rather than on the physiological information in the data. This project aims to develop tools that help identify when AI models use spurious information and tools for building better AI models that avoid it. The results of this project will be algorithms that are applicable across several types of data and domains. The project will foster the development of undergraduate and PhD students through new lectures on AI for the real world and will promote data literacy through visualizations of AI models made by the tools the project will develop.

There are two technical thrusts in this project. The first thrust focuses on interpretability of AI models. It seeks to develop methods that can help identify the use of spurious information and, given knowledge of the spurious information in an input, can help identify the semantic information in that input that is useful for predicting a label. This thrust will adapt the concept of learning to explain, which trains a function to highlight the parts of an input that are important for predicting a label, to the task of identifying and downweighting spurious information.

The second thrust constructs new representation learning algorithms for building models that avoid the use of spurious information. Spurious information consists of relationships between variables that change across a family of data-generating distributions. This thrust will study the limits of reweighting-based estimators and flexible models, and the utility of stronger assumptions such as the existence of a residual. It will also study the assumptions needed to address violations of positivity and how to do representation learning that avoids spurious information in multimodal data.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
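As an illustration of what a reweighting-based estimator can look like, the sketch below reuses the camel and cow example from the abstract. It is not the project's method. It assumes a discrete label y and a discrete, observed nuisance variable z (the image background), and it weights each example by p(y) / p(y | z) so that the label and the nuisance become independent in the reweighted data. The function names `balancing_weights` and `weighted_joint` and the toy data are hypothetical.

```python
from collections import Counter


def balancing_weights(labels, nuisances):
    """Weight each example by p(y) / p(y | z), estimated from counts.

    Under these weights the label y is independent of the nuisance z,
    so z carries no information about y in the reweighted data.
    """
    n = len(labels)
    count_y = Counter(labels)
    count_z = Counter(nuisances)
    count_yz = Counter(zip(labels, nuisances))
    weights = []
    for y, z in zip(labels, nuisances):
        p_y = count_y[y] / n
        p_y_given_z = count_yz[(y, z)] / count_z[z]
        weights.append(p_y / p_y_given_z)
    return weights


def weighted_joint(labels, nuisances, weights):
    """Return the normalized weighted joint distribution over (y, z)."""
    total = sum(weights)
    joint = Counter()
    for y, z, w in zip(labels, nuisances, weights):
        joint[(y, z)] += w / total
    return dict(joint)


# Hypothetical toy data: camels mostly appear on sand, cows mostly on grass.
labels = ["camel"] * 9 + ["camel"] * 1 + ["cow"] * 1 + ["cow"] * 9
backgrounds = ["sand"] * 9 + ["grass"] * 1 + ["sand"] * 1 + ["grass"] * 9

weights = balancing_weights(labels, backgrounds)
joint = weighted_joint(labels, backgrounds, weights)
for pair, mass in sorted(joint.items()):
    print(pair, round(mass, 3))
```

In this toy data every (label, background) pair ends up with equal mass after reweighting, so the background no longer predicts the label. The sketch only works because every pair appears at least once. If some pair never occurred, for instance no camel on grass, no finite weight could restore it, which is the kind of positivity violation the abstract refers to.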
