Statistical, population genetics and genetic epidemiology

$733,001 | ZIA | FY2021 | ES | NIH

National Institute of Environmental Health Sciences

Abstract

Increased availability of data and accessibility of computational tools in recent years have created an unprecedented upsurge of scientific studies driven by statistical analysis. Limitations inherent to statistics constrain the reliability of conclusions drawn from data, and misuse of statistical methods is a growing concern. We have been developing tools for improving the predictability of research findings using common measures of statistical significance. These methods operate on test statistics or P-values as summaries of data and also incorporate external or prior information to make inferences about the uncertainty in statistics or parameters of interest, such as P-values or risk of disease. In recent publications we developed meta-analytic methods (Vsevolozhskaya, Hu, Shi, Zaykin, 2020; Vsevolozhskaya, Hu, Zaykin, 2019) and approximate Bayesian methods that use the information contained in P-values but overcome their flaws and limitations (Vsevolozhskaya, Zaykin, 2020). These methods build on our continuing research on the analysis of top-ranking genetic associations with disease, with applications to detecting the aggregated effects of multiple weak predictors on complex disorders. In the course of this research we have been developing approaches for discovering genetic associations with disease using summaries of data, such as test statistics. Analysis based on summary statistics can be nearly as efficient as analysis of raw data and has some advantages, including ease of transfer, accessibility, and simplified extraction from publications. Combining summary statistics across genetic variants yields measures of gene-wide association that are useful in applications such as gene set analyses. Using summary statistics to construct Bayesian estimates and intervals improves resistance to the winner's curse and to selection-induced biases, including bias due to multiple testing. Our methods are being applied in collaborative projects with Dr. Gordenin's group to explore patterns of somatic mutations in cancer genomes. Additionally, we are working on methods for enriching variants that are likely to be involved in gene-gene or gene-environment interactions. We are extending methods for detecting variance quantitative trait loci to binary traits, such as disease status. Our work will be applied to data from the UK Biobank.
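
As a rough illustration of how P-values for several variants can be combined into a single gene-wide measure, the sketch below uses Fisher's classical combination method. This is a textbook technique chosen for brevity, not the method developed in the cited publications. It assumes the P-values are independent, which variants in linkage disequilibrium generally are not. The function name and the input values are hypothetical.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent P-values with Fisher's method (illustrative only)."""
    if not pvalues:
        raise ValueError("need at least one P-value")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("P-values must lie in (0, 1]")
    k = len(pvalues)
    # Fisher's statistic is X = -2 * sum(log p_i); under the null hypothesis
    # it follows a chi-square distribution with 2k degrees of freedom.
    half_x = -sum(math.log(p) for p in pvalues)
    # For even degrees of freedom 2k the chi-square tail has a closed form:
    # P(X > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half_x / j
        total += term
    return math.exp(-half_x) * total


# Hypothetical per-variant P-values for one gene (made-up numbers).
gene_p = fisher_combined_pvalue([0.04, 0.20, 0.01])
```

When variants are correlated, the independence assumption breaks down and the combined P-value from this sketch would be miscalibrated, so a real analysis would need a method that accounts for the correlation structure.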
