
High throughput functional annotation of the impact of human UTR sequence variation on gene function and protein output

$755,420 | R01 | FY2025 | HG | NIH

Yale University, New Haven CT

Investigators

Abstract

Sequencing studies have identified a large set of variants in the human genome, the majority of which fall outside protein-coding sequences. Decoding the significance of these variants for gene function represents a major challenge. Untranslated regions (UTRs) of mRNAs play an important role in gene regulation, and single-nucleotide changes in these regions can have remarkable effects on protein expression and gene function. The goal of this proposal is to develop new and generalizable genomic and computational approaches to understand the functional effect of genetic variation in the 5’ and 3’ UTRs and its impact on protein output and gene function. We propose two aims to achieve this goal. First, we will identify all the variants in the human 5’ and 3’ UTRs, characterize their conservation and association with disease, and develop a high-throughput parallel method (NaP-TRAP-seq) to quantify the regulatory activity of each variant on mRNA translation and protein output across different cell types (Aim 1). This aim will generate extensive quantitative data on the effect of each UTR variant on protein output. Second, we will develop a computational model to predict the impact of each variant on protein output and develop a new constraint model to analyze how variant conservation relates to the effect on protein output and translation (Aim 2). This model will predict the functional effects of known and novel variants on the translation of the downstream coding sequence and how sequence variants affect gene function. Together, these aims will provide a generalizable method to understand the regulatory significance of each variant in the UTR, will provide the principles to interpret variants of uncertain significance in the UTR, and will help predict the functional consequences of sequence variants across trait- and disease-associated genes.
