Computational systems enhancing the understanding of genomic and epigenomic features that define and differentiate chromatin environments at promoters and enhancers
National Institute Of Environmental Health Sciences
Abstract
With the advent of next-generation sequencing (NGS), a diverse variety of techniques for whole-genome characterization of biological processes has emerged. These techniques allow for interrogation of genomic sequence (DNA-seq), DNA accessibility (DNase-seq), DNA-protein interactions (ChIP-seq), and RNA expression (RNA-seq), among other biological properties. Although valuable on their own, integration of these approaches provides a fuller picture of highly coordinated biological processes, such as gene regulation. However, NGS data remain inaccessible to many life scientists who lack the requisite computational expertise, and there is a distinct absence of tools and resources accessible to them that allow for informative integration of NGS data. To address these issues, we have developed several resources supporting integrative omics. These tools allow users to integrate multiple datasets through analysis anchored on a user-defined list of genomic features, and they support machine-learning- and social-network-based recommendation and classification.
It has long been understood that transcription of genes initiates from promoters, which provide a platform for assembly of transcriptional proteins. The chromatin environment at mammalian gene promoters, consisting of positioned nucleosomes displaying functional histone modifications, is a key player in gene regulation, contributing to gene activation and repression in a variety of regulatory mechanisms. In addition to sense-strand transcription of gene sequences, antisense transcription is prevalent at promoters. Because it often results in a short-lived non-coding RNA transcript, the function of antisense transcription is poorly understood. Mechanistic understanding of these processes requires integration of multiple diverse NGS datasets.
Employing our new computational resources, we characterized transcription in mammalian cells and identified correlations between antisense transcription and the chromatin environment at promoters. We further examined antisense transcription initiating between nucleosomes regularly positioned downstream of gene transcription start sites and investigated whether nucleosomes between sense and antisense transcription start sites display histone modifications associated with active gene promoters. We also examined whether chromatin remodelers and other protein complexes responsible for creation and maintenance of the promoter chromatin environment are associated with this same region, which would point to a potentially important role for antisense transcription in the regulation of gene expression. We have further developed user-friendly web-based tools with backend infrastructures, data and metadata management systems, recommendation engines, and intuitive visual resources that make such analyses interoperable with other datasets and analyses and support scaling them to tens of thousands of datasets.
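As a rough illustration of what analysis "anchored on a user-defined list of genomic features" can look like, the sketch below counts reads from several datasets in a fixed window around each feature and returns one row per feature and one column per dataset. It is not the project's software: the function names (`count_in_window`, `anchored_matrix`), the input formats, the toy data, and the window size are all assumptions made for this example.

```python
from bisect import bisect_left, bisect_right


def count_in_window(sorted_positions, center, flank):
    """Count positions p with center - flank <= p <= center + flank."""
    lo = bisect_left(sorted_positions, center - flank)
    hi = bisect_right(sorted_positions, center + flank)
    return hi - lo


def anchored_matrix(features, datasets, flank=500):
    """Summarize every dataset around every anchor feature.

    features: list of (name, chrom, position) anchors (hypothetical format),
              e.g. transcription start sites chosen by the user.
    datasets: dict mapping dataset name -> {chrom: sorted list of read positions}.
    Returns (column_names, rows), where rows maps feature name -> list of counts
    in the same order as column_names.
    """
    column_names = sorted(datasets)
    rows = {}
    for feat_name, chrom, pos in features:
        rows[feat_name] = [
            count_in_window(datasets[name].get(chrom, []), pos, flank)
            for name in column_names
        ]
    return column_names, rows


# Toy inputs, invented for illustration only.
features = [("geneA_TSS", "chr1", 1000), ("geneB_TSS", "chr1", 5000)]
datasets = {
    "ChIP_H3K4me3": {"chr1": [900, 950, 1100, 4990]},
    "DNase": {"chr1": [1000, 5000, 5100, 9000]},
}

columns, matrix = anchored_matrix(features, datasets, flank=200)
for feat_name, counts in matrix.items():
    print(feat_name, dict(zip(columns, counts)))
```

Because every dataset is summarized over the same anchors, the resulting rows can be compared or clustered directly. A real pipeline would additionally need strand awareness, normalization for sequencing depth, and binned rather than single-window signal.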