GrantIndex

Genotype and Histological Phenotype Relationships in Cancer, with Automated Therapy Optimization.

$44,524 · F31 · FY2018 · CA · NIH

Weill Medical Coll Of Cornell Univ, New York NY


Abstract

Modern digital pathology departments produce a tremendous amount of whole slide image data, which is quickly growing to petabyte scale. This wealth of data is an unprecedented resource for medical machine learning tasks, including improvements in precision medicine. Unfortunately, the vast majority of digital slides are not annotated at the image level, so histological points of interest are not integrated with the clinical notes or genetics associated with a patient. In contrast to other disciplines, manual labeling is not only cumbersome and time-consuming; given the decades-long training of a pathologist, it is also exorbitantly expensive and, due to clinical time constraints, impractical. Moreover, the time-dependent relationship between a patient's histology and genotype is not quantitatively leveraged to recommend combination therapies. Genetics informs us of important driver mutations, but how multiple cell types interact with these mutants over time in the tumor microenvironment to become histologically evident is less clear. Deep learning synthesizes generations of pathologist knowledge into accurate quantitative models. Given a picture of a patient's morphology, I provide a tool that in four seconds finds the top ten most similar patients with their diagnoses, to support a pathologist's diagnostic decisions under the time pressures of active surgery. Recording the pathologist inspecting a slide at the microscope automatically annotates observed slide regions with time. These recordings are amenable to learning models that predict whether or not a region is salient to a pathologist making a diagnosis, and they also allow all slides in a hospital to be annotated to identical criteria from only a representative sample of slides. I have submitted a manuscript reporting 85.15% accuracy in this saliency prediction task.
This annotation greatly simplifies machine learning tasks, which can now focus on a non-redundant set of diagnostic regions in the slide, whether the application is to (a) find similar patients by morphology for diagnosis or (b) relate diagnostic morphology to genetics. Statistically modeling the relationship of the genotype to the histological phenotype in cancer opens promising new avenues in precision medicine. Taking a Big Data approach, I will leverage over 18,244 paired genome-histology samples to learn this model, using transfer learning techniques to maximize the value of all 18,244 samples for each tissue type. With the genotype-phenotype model in hand, I will simulate the molecular clock in cancer, incrementally mutating the genome and predicting the corresponding histology at each molecular time step. Using Q-learning similar to the reinforcement learning that powers Google's champion artificial intelligence "AlphaGo", I will learn an agent that inhibits expression of a small set of mutant genes to maximize cancer progression-free survival time, measured in molecular time in the simulator. This not only measures the therapy's evolutionary durability, but also leads directly to experimentally testable hypotheses in 3D cell culture.
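
The therapy-optimization idea in the abstract can be illustrated with tabular Q-learning on a toy simulator. This is a minimal sketch under stated assumptions, not the investigator's method: the stage count, the gene count, the `ADVANCE_PROB` table, and every function name are hypothetical, and the real proposal would replace the toy `step` function with the learned genotype-phenotype simulator. The agent earns +1 for each molecular time step without progression, so maximizing return corresponds to maximizing progression-free survival time in the simulator.

```python
import random

N_STAGES = 4    # hypothetical: stages 0..3, where stage 3 means progression
N_GENES = 3     # hypothetical: candidate mutant genes the agent may inhibit
MAX_STEPS = 50  # cap on molecular time steps per simulated episode

# Made-up probability that the tumor advances one stage in a time step,
# indexed as ADVANCE_PROB[stage][inhibited_gene]. In this toy, inhibiting
# the gene that matches the current stage slows progression the most.
ADVANCE_PROB = [
    [0.1, 0.9, 0.9],
    [0.9, 0.1, 0.9],
    [0.9, 0.9, 0.1],
]


def step(stage, gene, rng):
    """Advance the toy simulator by one molecular time step."""
    if rng.random() < ADVANCE_PROB[stage][gene]:
        stage += 1
    done = stage == N_STAGES - 1
    reward = 0.0 if done else 1.0  # +1 per progression-free time step
    return stage, reward, done


def train(episodes=5000, alpha=0.05, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table with epsilon-greedy tabular Q-learning."""
    rng = random.Random(seed)
    q = [[0.0] * N_GENES for _ in range(N_STAGES)]
    for _ in range(episodes):
        stage = 0
        for _ in range(MAX_STEPS):
            if rng.random() < epsilon:
                gene = rng.randrange(N_GENES)  # explore
            else:
                gene = max(range(N_GENES), key=lambda g: q[stage][g])  # exploit
            nxt, reward, done = step(stage, gene, rng)
            # Q-learning target: bootstrap from the best next action
            target = reward if done else reward + gamma * max(q[nxt])
            q[stage][gene] += alpha * (target - q[stage][gene])
            stage = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Return the preferred gene to inhibit at each non-terminal stage."""
    return [max(range(N_GENES), key=lambda g: q[s][g])
            for s in range(N_STAGES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Because the toy transition table rewards inhibiting the gene that matches the current stage, the learned greedy policy would be expected to pick that gene at each stage. The tabular form is chosen only for readability; a state space built from real genomes would need function approximation.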

View original record on NIH RePORTER →