Algorithm-based prevention and reduction of differences in cancer outcomes arising from data imbalance
University of Tennessee Health Science Center, Memphis, TN
Abstract
A long-term, cumulative data imbalance exists in biomedical research and clinical studies. As data-driven, algorithm-based biomedical research and clinical decision-making become increasingly common, this severe imbalance is likely to produce differences in cancer outcomes, and those differences may affect all cancer types. The overall objective of this work is to obtain key knowledge and create open resources that establish a new paradigm for machine learning with clinical omics data. Guided by strong preliminary data, we will pursue two specific aims: 1) determine, from cancer clinical omics data and genotype-phenotype data, under what conditions and to what extent the transfer learning scheme improves machine learning model performance across population groups; and 2) create an open resource system for robust machine learning to prevent or reduce differences in cancer outcomes arising from biomedical data imbalance. The approach is innovative because it departs substantively from the status quo, shifting the paradigm from mixture learning and independent learning schemes to a transfer learning scheme. The proposed research is significant because it is expected to establish a new paradigm for robust machine learning and to provide an open resource system that facilitates this shift.
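The three learning schemes named in the abstract can be contrasted with a small, purely illustrative sketch. It is not the project's method or data: the synthetic groups, the logistic-regression model, the sample sizes, and the function names (`fit_logistic`, `make_group`, `compare_schemes`) are all assumptions introduced here. Mixture learning is taken to mean training one model on pooled data, independent learning to mean training on the data-poor group alone, and transfer learning to mean pretraining on the data-rich group and then fine-tuning on the data-poor group.

```python
import numpy as np


def fit_logistic(X, y, w_init=None, steps=500, lr=0.1):
    """Gradient-descent logistic regression; returns the weight vector."""
    w = np.zeros(X.shape[1]) if w_init is None else np.array(w_init, dtype=float)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))    # predicted probabilities
        w -= lr * (X.T @ (p - y)) / len(y)    # mean log-loss gradient step
    return w


def accuracy(w, X, y):
    """Fraction of samples whose predicted label matches y."""
    return float(np.mean((X @ w > 0) == (y == 1)))


def make_group(n, w_true, rng):
    """Synthetic (hypothetical) group: Gaussian features, logistic labels."""
    X = rng.normal(size=(n, w_true.size))
    p = 1.0 / (1.0 + np.exp(-(X @ w_true)))
    y = (rng.random(n) < p).astype(float)
    return X, y


def compare_schemes(seed=0, n_major=2000, n_minor=40, d=20):
    """Evaluate three schemes on held-out data from the data-poor group."""
    rng = np.random.default_rng(seed)
    w_major = rng.normal(size=d)                  # data-rich group signal
    w_minor = w_major + 0.5 * rng.normal(size=d)  # related but shifted signal
    X_major, y_major = make_group(n_major, w_major, rng)
    X_minor, y_minor = make_group(n_minor, w_minor, rng)
    X_test, y_test = make_group(1000, w_minor, rng)

    # Mixture learning: one model trained on the pooled data of both groups.
    w_mix = fit_logistic(np.vstack([X_major, X_minor]),
                         np.concatenate([y_major, y_minor]))
    # Independent learning: a model trained on the data-poor group alone.
    w_ind = fit_logistic(X_minor, y_minor)
    # Transfer learning: pretrain on the data-rich group, then fine-tune
    # briefly on the data-poor group starting from the pretrained weights.
    w_src = fit_logistic(X_major, y_major)
    w_tr = fit_logistic(X_minor, y_minor, w_init=w_src, steps=50)

    return {
        "mixture": accuracy(w_mix, X_test, y_test),
        "independent": accuracy(w_ind, X_test, y_test),
        "transfer": accuracy(w_tr, X_test, y_test),
    }


if __name__ == "__main__":
    for scheme, acc in compare_schemes().items():
        print(f"{scheme}: {acc:.3f}")
```

Which scheme performs best in this toy setting depends on the assumed group sizes and on how far the two groups' signals differ, which is the kind of dependence the first aim proposes to characterize on real data.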