Statistical Learning for Precision Medicine Based on Multi-Source Data
Stanford University, Stanford CA
Abstract
The pursuit of tailored treatment strategies for individual patients remains crucial for enhancing cost-effectiveness in clinical practice. Despite advances in statistical methodology and machine learning, persistent challenges impede the progress and implementation of precision medicine. Limited sample sizes pose a significant hurdle in estimating individualized treatment effects on clinical outcomes, necessitating the use of information from multiple data sources. However, effective integration of such data requires appropriately addressing population heterogeneity, privacy constraints, and feature alignment across datasets. Furthermore, even when a set of well-developed prediction models of varying complexity is available, strategies are still needed for employing them adaptively in practice. Lastly, addressing treatment effect heterogeneity in clinical trials remains challenging, particularly in efficiently synthesizing information from both the discovery and validation stages without introducing bias. Our proposal aims to develop innovative solutions to these problems. First, we will introduce a novel transfer learning approach to accommodate overlapping but non-identical prediction feature sets in the source and target populations. Second, we will develop a latent class model that leverages knowledge graph information from multiple sources for flexible feature alignment. Third, we will create a dynamic prediction strategy that optimizes the sequence in which prediction features are acquired, thereby enhancing prediction accuracy while minimizing measurement cost. Fourth, we will extend single-site reinforcement learning to the federated learning setting under privacy constraints, so that adaptive strategies such as personalized dynamic treatment regimens can be better developed.
Lastly, we will propose a comprehensive framework for integrating information from both the discovery and validation stages when studying treatment effect heterogeneity, enabling unbiased inference of treatment effects among a selected subgroup of responders. All methodological developments will be evaluated through rigorous numerical studies and real-data applications to ensure their effectiveness, and will be disseminated widely to benefit the clinical community.