Dynamic treatment regimes via smooth surrogate loss: theory, methods, and computational aspects

$199,986 | FY2023 | MPS | NSF

Texas A&M University, College Station TX

Investigators

Abstract

In complex diseases such as cancer, sepsis, or depression, patients often require treatment in multiple stages because the disease evolves over time. In such cases, physicians need not only to prescribe an initial treatment but also a policy for switching to an alternative treatment when necessary. Furthermore, since patients respond differently to treatments, physicians need to personalize the treatment policy to each patient's specific needs and profile. This project aims to use modern, flexible machine learning techniques and existing patient data to identify the optimal treatment policy, known as the optimal dynamic treatment regime (DTR), in such time-varying settings. Ensuring scalability to real-world electronic health record data with large sample sizes will be a key focus. This project will also develop and distribute user-friendly open-source software and provide research training experiences for graduate students.

Recent research has connected DTR policy learning to sequential classification problems, enabling the integration of machine learning techniques. However, computationally efficient methods for solving the resulting classification problems are currently limited to specific cases, such as binary-treatment settings, and are prone to variance inflation. The first step of this project aims to demonstrate that the sequential classification problem underlying any DTR policy learning problem can be provably solved via a smooth surrogate problem. This surrogate problem will be amenable to scalable machine learning tools, such as stochastic gradient descent, facilitating fast implementation. In the second step, this project will combine this machine-learning-based method with model-based techniques to construct a more stable hybrid estimator that is robust to potential model misspecification. In addition, this project will comprehensively investigate the performance limits of the resulting methods by integrating classification theory, offline reinforcement learning, and the theory of nonparametric statistical inference.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
