Statistical Models and Analysis of Complex Genomic Variation in Clonal Mixtures

$258,774 · R01 · FY2014 · HG · NIH

University Of Pennsylvania, Philadelphia PA

Abstract

DESCRIPTION (provided by applicant): Next-generation DNA sequencing (NGS) approaches are widely used in studying human diseases and identifying causative genetic variants. Increasingly, NGS methods are being used to define biologically relevant clonal mixtures, a frequently observed phenomenon in human disease. Examples of clonal mixtures in human disease include the tumor cell subpopulations found in cancer. Within a single tumor, and clearly evident in metastatic tumor sites, cancer cell clonal populations exist, are genetically distinct, and carry their own unique sets of somatic variants. A similar phenomenon occurs in viral infection, where multiple viral quasispecies are harbored within an infected individual; each quasispecies has its own unique set of genetic variants. One can quantitatively measure expansion or shrinkage of clonal populations through changes in the allelic representation of clonal variants. Specific cellular phenotypes are attributable to the unique clonal variants, and changes in their representation can be indicators of evolutionary processes. This is frequently the case for drug resistance in cancer and viral infections. Thus, clonal genetic variation has major implications for the pathogenesis of human disease and is increasingly being tested as a longitudinal indicator of disease progression and treatment resistance. The general availability of whole genome and deep targeted resequencing provides an opportunity to conduct systematic analysis of heterogeneous DNA mixtures that have different clonal components. However, in many cases the genetic variant of interest is present in very small proportions (<5%), which makes the delineation of these clonal variants exceedingly difficult. Many of the widely employed NGS analysis methods are optimized for detecting normal diploid genome variation. These approaches are not optimal for delineating genomic variants from complex clonal mixtures.
Some genomic DNA variant classes, such as genomic rearrangements, are extremely difficult to detect in the context of clonal mixtures. To improve the assessment of clonal variation and the evolution of specific clonal populations, we will develop innovative models and robust, sensitive statistical procedures. These methods will enable one to deconvolute genomic variation in clonal mixtures and consider clonal alterations through time and space. We will focus on improving the delineation of complex variations such as genomic rearrangements and other structural variations in genetic mixtures. To develop our methods, we will use heterogeneous DNA sequence data sets with in silico spike-in variants and determine the lowest threshold of detection that we can achieve with the best sensitivity and specificity. Subsequently, we will test these methods on NGS data sets from clinical samples, delineate clonal populations based on unique variants, and consider quantitative changes in allelic representation as seen in clonal expansion. These samples will be subject to whole genome and targeted resequencing. Cancer-relevant samples will include tumors with matched normal, primary, and metastatic DNA. We will consider viral quasispecies for a set of clinical samples where we have matched viral nucleic acid samples obtained longitudinally over the course of infection from a single individual. As a final milestone, we will release our methods as open source software for the biomedical research community.
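
The detection-threshold question raised in the abstract (variants present in fewer than 5% of reads) can be illustrated with a small calculation. The sketch below is not the applicants' method; it assumes a simple binomial read-count model, a hypothetical per-read error rate, and hypothetical function names, and it only covers point variants at a single site rather than rearrangements. It picks the smallest supporting-read count that sequencing error alone would rarely reach (specificity), then asks how often a true variant at a given allele fraction reaches that count (sensitivity).

```python
from math import comb


def binom_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p), clamped at zero."""
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return max(0.0, 1.0 - below)


def min_supporting_reads(depth, error_rate, alpha):
    """Smallest read count k that error alone reaches with probability <= alpha."""
    k = 0
    while binom_tail(depth, k, error_rate) > alpha:
        k += 1
    return k


def sensitivity(depth, allele_fraction, error_rate=0.001, alpha=1e-3):
    """Return (k, probability that a true variant yields at least k supporting reads).

    Assumption: each read independently shows the variant allele with
    probability f * (1 - e) + (1 - f) * e, where f is the allele fraction
    and e is the (hypothetical) per-read error rate.
    """
    k = min_supporting_reads(depth, error_rate, alpha)
    p_variant_read = (
        allele_fraction * (1 - error_rate) + (1 - allele_fraction) * error_rate
    )
    return k, binom_tail(depth, k, p_variant_read)


if __name__ == "__main__":
    # Compare shallow and deep coverage for a 2% spike-in variant.
    for depth in (50, 200, 1000):
        k, s = sensitivity(depth, allele_fraction=0.02)
        print(f"depth={depth:5d}  min reads={k}  sensitivity={s:.3f}")
```

Under these assumptions, deeper coverage raises the read-count threshold needed for specificity but raises sensitivity far more, which is one reason deep targeted resequencing is attractive for low-fraction clonal variants.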
