Leveraging identity-by-descent information in large-scale population sequencing

$337,610 · R01 · FY2014 · MH · NIH

Icahn School Of Medicine At Mount Sinai, New York NY

Linked publications & trials

Paper 27533299, Paper 26915512, Paper 26196440, Paper 24739678, Paper 24463508, Paper 24463507

Abstract

DESCRIPTION (provided by applicant): This project aims to develop ways in which the patterns of shared ancestral gene flow for specific chromosomal segments can be inferred between seemingly unrelated individuals and used to empower analyses of rare mutations discovered by sequencing, with respect to association with diseases such as schizophrenia. Identity-by-descent (IBD) implies that two or more individuals each carry an extended stretch of haploid sequence that is a direct copy, or descendant, of a single ancestral haplotype that resides (or once resided) in a recent common ancestor of those individuals. In large samples it is not unusual to find many thousands of instances in which seemingly unrelated individuals are, for some fraction of their genome, related exactly as closely as are parent and offspring. In the context of large, population-based studies of rare and common genetic variation, we propose that layering a map of inter-individual IBD sharing on top of datasets of rare mutations and polymorphisms from sequencing can help in the daunting challenge of relating genetic variation to risk for common disease. Specifically, we propose to use IBD sharing information in sequencing studies to 1) identify likely de novo and very recent (private) mutations, 2) prioritize rare variants by likely functional impact, and 3) use additional unsequenced samples to prioritize rare alleles according to the likelihood that they are causal, given those samples' IBD sharing with sequenced individuals. We will apply the methods developed here to two large schizophrenia sequencing studies, with whole-exome data on over 6,000 individuals and genome-wide SNP data on over 14,000. The statistical approaches developed here will be implemented and distributed as part of the PLINK/Seq software package.
