CAREER: Advances in Modern Causal Inference: High Dimensions, Heterogeneity, and Beyond

$400,000 · FY2021 · MPS · NSF

Carnegie Mellon University, Pittsburgh PA

Abstract

Causality is at the heart of many of the most important questions in science and policy. Which cancer treatments are best for which patients? Does incarceration impede or encourage recidivism? Causal inference is concerned with formulating such questions mathematically, exploring whether answers can be obtained from data, and if so, determining how well and with what methods. The classical setup in causal inference explores effects of simple interventions, and presumes confounding relationships are straightforward enough to estimate with relatively low error. However, simple "all-or-nothing" effects may mask underlying heterogeneity, and can be practically unrealistic or nearly impossible to estimate, for example if some subjects have no chance of receiving treatment. Further, in modern contexts, confounders are often high-dimensional and relate to exposures and outcomes in unknown and possibly very complex ways. Accommodating realistic confounding and heterogeneity are two of the most central challenges in modern causal inference. These pursuits yield myriad open questions, from understanding fundamental limits of causal inference in high dimensions to exploring entirely new effects altogether. This project aims to help address these questions by developing novel theory and methods and advancing the application of causal inference in fields such as public policy and medicine. Outreach is also a major component. New software will be made freely available in R. The PI will design an undergraduate course on quantitative causal reasoning, to help push data literacy forward from association to causation. A textbook will be written, and there will be numerous opportunities for broad participation, including summer programs, workshops, and short courses.

This project aims to develop new theory and methods for the study of more nuanced, yet practical, effect measures, accommodating the complex data structures often found in practice.
The research will focus on (1) adjustment for high-dimensional confounding and (2) flexible estimation of heterogeneous treatment effects and optimal treatment regimes. Extensions will also be pursued for multivalued time-varying exposures subject to unmeasured confounding. For (1), the PI aims to develop novel non-asymptotic risk bounds for both classical and new propensity-based effects, as well as minimax lower bounds. This is accomplished in a high-dimensional discrete model (new to causal inference) as well as with continuous data. For (2), the PI plans to determine the fundamental limits of heterogeneous effect estimation in flexible nonparametric models, develop and analyze novel heterogeneous effect estimators, and study optimal treatment regimes under novel "contact constraints." This work has the potential to help transform our understanding of causal inference in the modern big data era. The projects will also directly contribute to research on specific applications in sociology, criminology, and medicine.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
