Augmenting TOPMed WGS studies across the comprehensive spectrum of short tandem repeats (STRs).
University Of Michigan At Ann Arbor, Ann Arbor MI
Abstract
SUMMARY: The NHLBI TOPMed whole genome sequencing (WGS) studies are generating sequence reads at an unprecedented scale, totaling >2 quadrillion bases and >300 million variants across >20,000 individuals. While >97% of accessible genomic regions are exhaustively interrogated by existing variant calling methods, the remaining ~3%, which are repeat-rich regions, are insufficiently interrogated because of the limited ability to call short tandem repeats (STRs). Because ~50% of short insertions and deletions (indels) are found in repeat-rich regions of the genome, comprehensive STR calling is important for reaching near-complete sensitivity in identifying disease-causing variants from TOPMed WGS studies. In this application, we build on our record of developing innovative methods and analyzing petabytes of TOPMed WGS reads to generate comprehensive and accurate short variant calls, capitalizing on STRs, from TOPMed WGS studies. We leverage related and duplicated samples to improve the quality of STR calls. We also propose to estimate mitochondrial DNA copy numbers and telomere lengths from the sequence data, and to perform genome-wide association studies to demonstrate the power of the new STR-augmented callset.