Statistical Methods For Genetic Epidemiology
National Institute Of Environmental Health Sciences
Abstract
Genome-wide association studies typically compare cases and controls at single autosomal SNPs, under the implicit assumption that heritable effects are attributable to inherited autosomal variants. Four nonstandard genetic mechanisms could be involved as well: sex-linked traits; maternally mediated effects, in which the mother influences the development of her fetus during gestation and thereby its later risk; mutations in the mitochondrial DNA; and parent-of-origin effects. Each of these nonstandard mechanisms can cause asymmetry in family history data, which can be studied even in the absence of any genotype data. In one project, we are estimating the extent of asymmetry that such mechanisms would produce in family history data. We applied this strategy to family history data from our large study of women, each of whom had a sister diagnosed with breast cancer (the NIEHS Sister Study), and found evidence that maternal grandmothers of young-onset (under age 50) cases of breast cancer were more likely to have had breast cancer than were their paternal grandmothers. This observation suggests that there may be maternally mediated genetic risk factors for breast cancer, that there may be imprinted genes related to risk, or that mitochondrial variants play a role. Epigenetics could also be important for breast cancer. A particularly important design we are now considering involves a tetrad structure: one affected and one unaffected offspring, together with their two parents. This design has been implemented in the Two Sister Study (funded in part by Susan G. Komen for the Cure), which is assessing the joint role of genetic and environmental risk factors in young-onset (under age 50) breast cancer. The discordant sib pair allows estimation of the effects of exposures, while the embedded case-parent triad allows detection of haplotypes that confer either protection or risk.
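As an illustration of how asymmetry in family history data can be tested without genotypes, the grandmother comparison above can be framed as a paired test: each case contributes one maternal and one paternal grandmother, and only cases whose grandmothers are discordant for breast cancer are informative. The sketch below uses an exact McNemar (sign) test. It is not necessarily the analysis used in the Sister Study, and the function name and any counts passed to it are hypothetical.

```python
from math import comb


def exact_mcnemar(maternal_only: int, paternal_only: int) -> float:
    """Two-sided exact McNemar p-value from discordant-pair counts.

    maternal_only: cases whose maternal (but not paternal) grandmother
                   had breast cancer (hypothetical count).
    paternal_only: cases whose paternal (but not maternal) grandmother
                   had breast cancer (hypothetical count).
    """
    n = maternal_only + paternal_only
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence either way
    k = min(maternal_only, paternal_only)
    # Under symmetry, each discordant pair is equally likely to go either
    # way, so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)  # double the tail, capped at 1
```

A small p-value indicates that affected grandmothers fall more often on one side of the family than symmetry would predict. The test alone does not distinguish among the maternally mediated, imprinting, and mitochondrial explanations.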
The tetrad, analyzed as a unit, should provide a powerful design for assessing gene-by-environment interaction, and we have been developing and evaluating methods for use with it. The Two Sister Study has completed enrollment of nuclear families in which one daughter developed breast cancer before age 50 and the other daughter is unaffected. This is described under a separate project. Inherited genotypes, together with tumor characteristics, will need to be explored to investigate factors that predict the clinical course following treatment, and improved statistical methods will also need to be developed in that context. We are undertaking a genome-wide association study based on these data through a contract with the Center for Inherited Disease Research at Johns Hopkins, and will be able to explore gene-by-environment effects on risk of young-onset breast cancer as well as maternally mediated effects and possible parent-of-origin effects on risk. The genotype data have now been received and, after augmentation by imputation carried out at the University of Washington, comprise some 20 million SNPs. The Illumina platform used was the human OmniExpress plus Exome array, and the exome content will require further methods appropriate for rare alleles. We are also participating in the GAME-ON consortium, which has provided additional SNPs from the newly developed onco-chip. We have begun analyses of these data by developing a risk score based on SNPs that have previously been replicated. In a methodologic extension of our earlier work on case-parent triads, we are developing methods to account for parental phenotypes and applying those methods to our Two Sister Study, in which some 20% of the mothers also had breast cancer. Together with Alison Wise, a graduate student from UNC Biostatistics, we are working on methods for identifying variants on the X chromosome related to risk.
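A risk score built from previously replicated SNPs is commonly computed as a weighted sum of risk-allele counts, with each SNP weighted by the log of its published odds ratio. The abstract does not say how the score here is constructed, so the sketch below is one conventional form under that assumption. The function name and SNP identifiers are hypothetical.

```python
from math import log
from typing import Mapping


def risk_score(dosages: Mapping[str, float],
               odds_ratios: Mapping[str, float]) -> float:
    """Weighted sum of risk-allele dosages (0 to 2) with log-OR weights.

    dosages:     SNP id -> number of risk alleles carried; may be
                 fractional when the genotype is imputed.
    odds_ratios: SNP id -> per-allele odds ratio taken from prior
                 replicated studies (hypothetical values in any example).
    SNPs without a dosage for this person are skipped.
    """
    return sum(log(odds_ratios[snp]) * dosages[snp]
               for snp in odds_ratios if snp in dosages)
```

For example, `risk_score({"snp_a": 2, "snp_b": 0}, {"snp_a": 1.2, "snp_b": 1.5})` returns twice the log of 1.2, because only `snp_a` contributes risk alleles.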
We are also assessing the performance of our new method by applying it to the Two Sister Study and to the dbGaP data on oral cleft. Our method, the PIX-LRT, uses parental information in a robust way in addition to transmission distortion, and thus makes more efficient use of the data than do existing methods. A paper on identifying risk-related variants on the X chromosome is nearly ready for submission.
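For readers unfamiliar with transmission distortion, the classic autosomal transmission disequilibrium test (TDT) gives a sense of the signal that triad-based methods draw on. Among heterozygous parents, it compares how often a candidate allele is transmitted to the affected offspring with how often it is not. The sketch below is that textbook test only. It is not the PIX-LRT, it does not use the additional parental information described above, and it ignores the special inheritance pattern of the X chromosome. The function name and counts are hypothetical.

```python
from math import erfc, sqrt
from typing import Tuple


def tdt(transmitted: int, untransmitted: int) -> Tuple[float, float]:
    """Classic TDT statistic and p-value (chi-square with 1 df).

    transmitted:   heterozygous parents who passed the candidate allele
                   to the affected offspring (hypothetical count).
    untransmitted: heterozygous parents who passed the other allele.
    """
    n = transmitted + untransmitted
    if n == 0:
        return 0.0, 1.0  # no informative parents
    chi2 = (transmitted - untransmitted) ** 2 / n
    # Upper tail of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value
```

With 60 transmissions and 40 non-transmissions, the statistic is (60 - 40)^2 / 100 = 4.0.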