
Statistical Methods for Imputation and Genome-Wide Association Studies

$37,542 · F31 · FY2017 · HG · NIH

University of California Los Angeles, Los Angeles, CA

Abstract

Genome-wide association studies (GWAS) mine vast amounts of genomic data to detect correlations between markers and traits. Datasets gathered from different genotyping platforms invariably contain a significant fraction of missing genotypes, which genotype imputation fills in. Unfortunately, imputation is computationally slow and prone to Mendelian inconsistencies when applied to family data. Most imputation methods also require large haplotype reference panels and phased data. A related problem is that standard GWAS analysis methods ignore haplotype structure. Including haplotype information in the form of "haplosnps," short sequences of single nucleotide polymorphisms (SNPs) located on the same chromosome strand, makes it possible to detect additional associations related to long-range genomic interactions. I have developed a fast and accurate matrix completion program for genotype imputation in Julia that employs Nesterov's accelerated gradient method. The program also applies a post-processing projection that enforces Mendelian consistency and offers a fast, reference-panel-based haplotyping option. I will add an option for haplotype estimation without a reference panel. Together, these tools will provide what is needed to prepare raw sequence data for haplosnp GWAS analysis, which I will also develop in Julia.
