Rates and patterns of recurrent structural variation in the mouse genome
University of Virginia, Charlottesville, VA
Abstract
DESCRIPTION (provided by applicant): Whole genome sequences and high-resolution tools have revealed that mammalian genomes have a complex and highly variable physical structure. We now recognize that structural variation (SV), defined as differences in the copy number, orientation or location of genomic segments larger than 1 kb, is prevalent in mammals. At least ~1,000 SVs distinguish the genomes of two humans; these affect an abundance of genes, and SVs are increasingly found to underlie normal and disease phenotypes. SV is of special relevance to our understanding of evolution and disease because single mutational events can effect large phenotypic changes, and because structural mutation rates vary dramatically between "hotspots" and "coldspots". We are only in the very early stages of understanding how structurally plastic genomes truly are, and why they are this way. High-throughput DNA sequencing methods now allow us to routinely obtain millions of paired-end sequence reads, providing extraordinarily meaningful information about genome structure. We have developed sophisticated tools to reconstruct genome architecture by paired-end sequencing. A novel aspect of our methods is that they are designed to identify SV throughout the entire genome, including structurally complex and/or repetitive regions. To investigate the extent and origin of SV in general, and of hotspots in particular, we will apply paired-end DNA sequencing to the genomes of 3 independently bred colonies from each of 4 different inbred mouse strains. These 12 lines represent the full breadth of inter-strain genetic variation, as well as ~2,000 generations of spontaneous intra-strain mutation. We will discover SVs, identify those that result from recurrent structural mutation at hotspots, and characterize hotspots for clues as to their origin and function. Our results will address three fundamental questions: 1) How much SV exists among the genomes of inbred mouse strains? 2) How structurally plastic are mammalian genomes over short time scales? and 3) What is the contribution of recurrent structural mutation to natural variation?
To address these questions, and to provide our computational tools to the broader genomics community, we will continue to develop and refine our existing SV discovery algorithms. Our ultimate goal is to create a software package that can rapidly process large amounts of paired-end sequence data with minimal computational resources. In collaboration with William Pearson, we will write a single algorithm that combines read mapping and SV discovery into one highly efficient process, thus overcoming current analysis bottlenecks. This software will allow small genomics labs to take advantage of powerful new sequencing technologies. Our results will yield a basic understanding of the dynamics of structural genomic variation in the mammalian germline, and will enable investigation of the causes and mechanisms of new mutation. This is a subject of paramount significance to human health given the emerging role of structural mutation in the etiology of inherited and spontaneous human diseases, including autism and schizophrenia.