A Data Analysis Center for integration of fly and worm modENCODE datasets
Massachusetts Institute of Technology, Cambridge, MA
Abstract
DESCRIPTION (provided by applicant): The aims of the ENCODE (Encyclopedia of DNA Elements) and modENCODE (model organism ENCODE) projects are to apply high-throughput, cost-efficient approaches to generate a catalog of functional elements in the human, worm, and fly genomes, which will serve as the basis for biomedical research advances. Owing to their smaller genome sizes, powerful genetics, and ease of experimentation, D. melanogaster and C. elegans can help guide the study of functional elements in the human genome, reveal new insights into global gene regulation and embryo development, and enable experimental studies of gene function and regulation that are not accessible in mammalian systems. This proposal aims to enhance the value of these datasets by creating a Data Analysis Center (DAC) to support, facilitate, and enhance integrative analyses of the modENCODE consortium in fly and worm, to achieve a high-resolution annotation of all their functional elements, and to reveal new insights into the biology and gene regulation of animal genomes, including the human genome. We foresee four central roles for the DAC and have organized our aims around them.
Aim 1: We will provide common computational guidelines for data processing in fly and worm, along with a shared computational infrastructure and pipeline for common analysis and statistical tasks.
Aim 2: We will facilitate and carry out element-specific integrative analyses to identify diverse classes of functional elements based on combinations of relevant datasets coming from multiple groups. These include (a) enhancers, promoters, insulators, and other regions of regulatory importance, (b) protein-coding and non-coding genes, (c) regulatory networks of transcription factor and microRNA targeting, and (d) sequence features predictive of diverse classes of functional elements.
Aim 3: We will carry out exploratory data analyses across different data types to discover potentially novel correlations and insights relating diverse classes of elements. In particular, we will apply dimensionality reduction techniques to coordinate-based genome-wide genomic and epigenomic datasets, apply clustering and bi-clustering methods to identify functionally related sets of genes and modules, and analyze structural and dynamic properties of the discovered networks.
Aim 4: We will carry out comparative analyses across the two model organisms, and also with yeast and human. We will provide an ortholog resource between the species, compare regulatory relationships and dynamics for orthologous cell lines and developmental time points, and carry over biological knowledge across model organisms and human.
To achieve these four aims, we will work closely with members of the consortium; with the modENCODE Analysis Working Group (AWG), consisting of all Principal Investigators and analysis groups; and with the Data Coordination Center (DCC), responsible for all data sharing within the consortium and with the larger worm and fly communities.
PUBLIC HEALTH RELEVANCE: The aims of the ENCODE (Encyclopedia of DNA Elements) and modENCODE (model organism ENCODE) projects are to apply high-throughput, cost-efficient approaches to generate a catalog of functional elements in the human, worm, and fly genomes, which will serve as the basis for biomedical research advances. Owing to their smaller genome sizes, powerful genetics, and ease of experimentation, D. melanogaster and C. elegans can help guide the study of functional elements in the human genome, reveal new insights into global gene regulation and embryo development, and enable experimental studies of gene function and regulation that are not accessible in mammalian systems. This proposal aims to enhance the value of these datasets by creating a Data Analysis Center (DAC) to support, facilitate, and enhance integrative analyses of the modENCODE consortium in fly and worm, to achieve a high-resolution annotation of all their functional elements, and to reveal new insights into the biology and gene regulation of animal genomes, including the human genome.