CAREER: Towards a Systematic Characterization of Model Explanations for High-Stakes Decision Making
Harvard University, Cambridge MA
Investigators
Abstract
As machine learning (ML) models are increasingly employed to make high-stakes decisions in real-world applications, it becomes crucial to ensure that the relevant stakeholders can understand and trust the functionality of these models. However, the increasing complexity and proprietary nature of ML models make it challenging for stakeholders to understand their behavior. Consequently, several methods have been proposed in recent years to explain the behavior of ML models in a human-interpretable fashion. These methods, however, adopt vastly different strategies to explain model behavior and often contradict each other. The increasing diversity of these explanation methods, coupled with the lack of systematic evaluation frameworks, has made it impossible to determine which methods are likely to be effective across different kinds of critical real-world applications. This project will build rigorous frameworks for systematically analyzing, evaluating, and comparing the reliability and utility of various state-of-the-art explanation methods across different real-world applications. The frameworks developed as part of this project have the potential to significantly accelerate the adoption of ML models in a variety of settings, including healthcare (e.g., patient treatment recommendations), lending (e.g., loan approval decisions), and hiring (e.g., resume screening).

This project aims to systematically characterize existing explanation methods so that practitioners can readily determine which methods to employ in a given real-world application. The project will focus on the following subtasks: 1) developing novel theoretical and empirical frameworks to analyze how reliably various state-of-the-art methods explain the behavior of different types of ML models (e.g., linear vs. non-linear models), 2) conducting large-scale user studies with domain experts in healthcare, lending, and hiring to evaluate the utility of existing explanation methods across different high-stakes applications, 3) leveraging the data obtained from these user studies to build algorithmic agents that can mimic the behavior of domain experts, and employing these agents, in turn, to evaluate the utility of explanation methods at scale, and 4) developing novel algorithmic frameworks that can automatically select an appropriate explanation method tailored to a given real-world context. With these contributions, this research will pave the way for a clearer understanding of, and a broader consensus on, which explanation methods are likely to effectively explain the behavior of different ML models across various high-stakes applications.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.