Statistical Methods for Gene/Environment Interaction and Genetic Susceptibility
National Institute of Environmental Health Sciences
Abstract
Identifying causative SNPs in a genome-wide study can be challenging when individual SNPs have small marginal effects because, to avoid an excess of false-positive findings, testing thresholds must reflect the large number of SNPs under study. For complex diseases, particular combinations of SNPs may dramatically increase disease risk through epistasis or gene-gene interactions. We are investigating the use of a machine learning technique with case-parent data to discover sets of SNPs that together cause disease (causative SNPs). First, we devised a way to use actual case-parent triad genotypes to create simulated genome-wide data sets that reflect realistic linkage disequilibrium structure and to seed them with known sets of causative SNPs. Second, we implemented an existing stochastic search algorithm (called GA-KNN), which is based on an evolutionary algorithm, to find multiple sets of d SNPs that are predictive of disease (here d is a small number, say 2 to 6). By cataloging the SNPs that appear most frequently among the sets that are predictive of disease, we hope to uncover the sets of causative SNPs. On simulated data seeded with multiple sets of interacting SNPs, our approach recovers the interacting sets of SNPs. This year we have refined our algorithm to increase its speed and accuracy. A set of d SNPs that our algorithm nominates may, however, increase disease risk jointly through independent marginal effects alone, that is, without being involved in epistatic interactions. We have therefore developed and evaluated a permutation test procedure to probe whether the risk increase attributed to a nominated set of d SNPs arises from epistatic interactions. We have also applied our algorithm to publicly available data on oral clefting to nominate potentially epistatic sets of SNPs. In ongoing work, we are modifying our algorithm to uncover sets of SNPs whose epistatic effects may differ between levels of an environmental factor.
(see also Z01 ES040007; PI Clare Weinberg; Min Shi is also a within-lab collaborator on this project; her time is allocated in Weinberg's project but not in this one.)
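The search-and-catalog idea described in the abstract can be illustrated with a small sketch. It is not the project's implementation: it reads GA-KNN as an evolutionary search scored by k-nearest-neighbour classification, it uses a simplified case/control layout rather than case-parent triads, and the function names (`knn_loo_accuracy`, `evolve_snp_set`, `catalog_snps`), the mutation scheme, and all default parameters are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def knn_loo_accuracy(genotypes, labels, snps, k=3):
    """Leave-one-out k-nearest-neighbour accuracy using only the chosen SNPs."""
    n = len(labels)
    correct = 0
    for i in range(n):
        # Manhattan distance on genotype codes (0, 1 or 2 copies of an allele).
        dists = sorted(
            (sum(abs(genotypes[i][s] - genotypes[j][s]) for s in snps), j)
            for j in range(n)
            if j != i
        )
        votes = sum(labels[j] for _, j in dists[:k])
        predicted = 1 if 2 * votes > k else 0  # majority vote among k neighbours
        correct += predicted == labels[i]
    return correct / n


def evolve_snp_set(genotypes, labels, d, rng, pop_size=20, generations=15, k=3):
    """Toy evolutionary search for one set of d SNPs with high KNN accuracy."""
    n_snps = len(genotypes[0])

    def fitness(snp_set):
        return knn_loo_accuracy(genotypes, labels, snp_set, k)

    # Start from random candidate sets of d distinct SNP indices.
    population = [tuple(sorted(rng.sample(range(n_snps), d))) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        survivors = ranked[: pop_size // 2]  # keep the fitter half
        children = []
        for parent in survivors:
            child = set(parent)
            child.discard(rng.choice(parent))  # mutation: drop one SNP ...
            while len(child) < d:
                child.add(rng.randrange(n_snps))  # ... and draw a replacement
            children.append(tuple(sorted(child)))
        population = survivors + children
    return max(population, key=fitness)


def catalog_snps(snp_sets):
    """Count how often each SNP appears across the nominated sets."""
    return Counter(s for snp_set in snp_sets for s in snp_set)
```

Under these assumptions, one would call `evolve_snp_set` many times with different random seeds, pass the resulting sets to `catalog_snps`, and inspect the most frequent SNPs with `Counter.most_common`. Whether a frequently nominated set reflects epistasis rather than independent marginal effects is a separate question that this sketch does not address.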