CAREER: Concise descriptors of genomic data facilitate mechanistic inference
University of Pittsburgh, Pittsburgh, PA
Abstract
Biological systems are more than the sum of their parts. Modern high-throughput technologies enable scientists to study biological systems holistically by taking thousands to millions of molecular measurements in a single sample. This new scientific paradigm has accelerated our understanding of complex biological processes such as embryonic development and cancer. However, sophisticated analytical frameworks are needed to turn large datasets into concrete biological insights. This project aims to build general and flexible algorithms that can reduce large collections of data into smaller, biologically meaningful representations. These representations are concise enough to be easily manipulated by non-experts yet rich enough to support both the originally intended analysis and future data reuse. The project will also develop teaching materials and hands-on activities to educate the next generation of biological data scientists across diverse education levels and technical backgrounds.

This project will develop a highly customizable Bayesian framework that unifies many approaches to interpretable dimensionality reduction by using flexible mixture distribution priors. Additionally, drawing on rich sources of biological prior knowledge, the proposed method will be capable of both quantifying and identifying by name specific biological processes embedded in high-dimensional data. In a separate but synergistic goal, the project will adapt state-of-the-art segmentation algorithms to automatically extract genomic features from multi-sample, base-resolution DNA methylation datasets, accelerating downstream analysis and interpretation. The results of this work can be found at http://chikinalab.org/.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.