An Accurate Machine Learning Framework for Childhood Acute Myeloid Leukemia Subtype Identification by Integrating Bulk and Single-Cell Multi-Omics Data Within and Beyond the CCDI Ecosystem

$500,000 | P30 | FY2023 | CA | NIH

University of Nebraska Medical Center, Omaha, NE

Linked publications & trials

Paper 39684445 Paper 39678719 Paper 39605468 Paper 39580490 Paper 39573886 Paper 39456534 Paper 39450301 Paper 39436320 Paper 39420209 Paper 39386613 Paper 39386448 Paper 39384153 Paper 39339419 Paper 39224072 Paper 39219576 Paper 39151112 Paper 39141488 Paper 39130636 Paper 39125017 Paper 39110778 Paper 39107463 Paper 39080401 Paper 39013847 Paper 38974465 Paper 38969056 Paper 38946943 Paper 38900505 Paper 38853862 Paper 38847506 Paper 38775157 Paper 38770637 Paper 38733583 Paper 38712184 Paper 38704607 Paper 38687198 Paper 38683788 Paper 38672426 Paper 38617282 Paper 38603627 Paper 38547055 Paper 38542184 Paper 38514303 Paper 38482976 Paper 38445903 Paper 38437585 Paper 38429478 Paper 38421730 Paper 38405788 Paper 38397158 Paper 38394685 Paper 38388663 Paper 38381773 Paper 38373465 Paper 38347948 Paper 38303112 Paper 38284649 Paper 38278978 Paper 38258111 Paper 38227647 Paper 38180583 Paper 38123510 Paper 38110836 Paper 38095415 Paper 38093839 Paper 38067133 Paper 38010872 Paper 38008263 Paper 38000343 Paper 37961833 Paper 37940068 Paper 37931287 Paper 37928187 Paper 37928181 Paper 37874141 Paper 37815870 Paper 37777555 Paper 37759490 Paper 37756579 Paper 37747235 Paper 37722652 Paper 37689825 Paper 37628698 Paper 37626854 Paper 37614231 Paper 37478915 Paper 37474760 Paper 37466902 Paper 37454130 Paper 37357909 Paper 37283496 Paper 37243489 Paper 37074924 Paper 37062329 Paper 37037900 Paper 36997106 Paper 36982565 Paper 36967186 Paper 36963650 Paper 36923592 Paper 36872587

Abstract

As a fatal childhood hematopoietic malignancy characterized by clonal expansion of immature myeloid precursors, acute myeloid leukemia (AML) usually leads to bone marrow failure and impaired hematopoiesis. AML has multiple distinct subtypes characterized by morphological, molecular, and genetic alterations. Identifying AML subtypes can facilitate downstream risk stratification and tailored treatment design. While various conventional methods such as morphological analysis, cytogenetic analysis, immunophenotyping, and molecular profiling have been used for AML subtype identification, they are usually costly, time-consuming, labor-intensive, and sometimes inaccurate. Recent work has applied next-generation sequencing (NGS) to AML subtype identification, but these approaches are limited to bulk NGS data or to single-omics data. With large volumes of omics data being generated within and beyond the Childhood Cancer Data Initiative (CCDI) ecosystem, we hypothesize that integrating single-cell and bulk multi-omics data, including genomics, transcriptomics, and epigenomics data, will significantly facilitate subtype-specific biomarker discovery and boost the accuracy of AML subtype identification. In this supplemental project under our parent award (CA036727), we propose to develop an integrated machine learning (ML) framework for accurate and cost-effective AML subtype identification by combining bulk and single-cell multi-omics data within and beyond the CCDI ecosystem. To achieve this, we will pursue two specific aims. In Aim 1, we will establish a knowledge-transfer ML model that leverages large-scale bulk and single-cell transcriptomics data for AML subtype identification.
Besides identifying well-annotated AML subtypes, we will also explore novel AML subtypes by detecting rare cell types in large-scale single-cell data; cluster-specific and rare-cell-type-specific gene signatures derived from these data can then be transferred to the bulk transcriptomics data to improve AML subtype identification. In Aim 2, we will develop a multi-kernel learning and multi-modal deep learning framework to systematically and automatically integrate information related to AML subtypes from single-cell and bulk multi-omics data (including genomics, transcriptomics, and epigenomics) to further boost AML subtype identification. Our model is flexible enough to handle cases in which only partial or incomplete multi-omics data are available for new patients. We believe successful completion of this study will directly improve downstream childhood AML risk stratification, facilitate diagnosis and prognosis, and optimize treatment selection. We also expect that the proposed framework can be customized and extended to identify subtypes of other pediatric, adolescent, and young adult (AYA) cancers, especially ultra-rare tumors.
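
The multi-kernel idea in Aim 2, including the case where a new patient has only some omics layers, can be illustrated with a minimal sketch. This is not the project's implementation: the modality names (`expr`, `meth`), the equal kernel weights, the RBF kernel, the mean-similarity classifier, and the synthetic data are all assumptions made for illustration. Each modality gets its own kernel, and for every pair of samples the kernels are averaged over only the modalities that both samples have.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.1):
    # Pairwise squared Euclidean distances, clamped at 0 against round-off.
    d2 = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def combined_kernel(mods_a, mods_b, weights, gamma=0.1):
    """Weighted average of per-modality kernels.

    mods_a / mods_b map a modality name to a 2D array (samples x features);
    a row of NaN marks a sample for which that modality is missing.
    """
    n_a = next(iter(mods_a.values())).shape[0]
    n_b = next(iter(mods_b.values())).shape[0]
    num = np.zeros((n_a, n_b))
    den = np.zeros((n_a, n_b))
    for name, w in weights.items():
        A, B = mods_a[name], mods_b[name]
        has_a = ~np.isnan(A).any(axis=1)
        has_b = ~np.isnan(B).any(axis=1)
        K = rbf_kernel(np.nan_to_num(A), np.nan_to_num(B), gamma)
        mask = np.outer(has_a, has_b)  # pairs where both samples have this modality
        num += w * K * mask
        den += w * mask
    # Pairs with no shared modality get similarity 0.
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)

def predict(K_new_train, y_train):
    # Assign each new sample to the class with the highest mean similarity.
    classes = np.unique(y_train)
    scores = np.stack(
        [K_new_train[:, y_train == c].mean(axis=1) for c in classes], axis=1
    )
    return classes[scores.argmax(axis=1)]

# Synthetic stand-in data: two hypothetical subtypes, two modalities.
rng = np.random.default_rng(0)
y_train = np.array([0] * 10 + [1] * 10)
centers = 3.0 * y_train[:, None]
train = {
    "expr": centers + 0.5 * rng.normal(size=(20, 5)),
    "meth": centers + 0.5 * rng.normal(size=(20, 5)),
}
weights = {"expr": 0.5, "meth": 0.5}  # assumed equal; a real model would learn these

# A new patient with expression data only; methylation is missing.
new = {"expr": np.full((1, 5), 3.0), "meth": np.full((1, 5), np.nan)}

K_train = combined_kernel(train, train, weights)
pred = predict(combined_kernel(new, train, weights), y_train)
print(pred)
```

One caveat on this design choice: averaging over pairwise-available modalities is simple, but the resulting matrix is not guaranteed to be positive semidefinite when modalities are missing, so a kernel method that requires a valid kernel would need an additional correction step.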
