Advanced Approaches for Integration and Analysis of Genomic Data
SUNY at Buffalo, Amherst NY
Investigators
Abstract
The over-availability of data and the under-availability of knowledge present a critical challenge for biological informatics in the years to come. Clearly, effective techniques are needed not only for storage and retrieval, but also for mining genomic data to increase our knowledge. However, the high dimensionality and enormous size of genomic data pose very challenging problems for analysis and visualization of the data sets. This project investigates novel approaches to analyzing gene expression data and integrating them into biological research. New algorithms and tools that can be used iteratively and interactively to mine the data will be developed. The strategies include a metadata hierarchy for integration of heterogeneous data, cluster-based indexing for high-dimensional data, inter-dimensional analysis for classification, and dynamic interactive visualization for pattern analysis. The approaches will be field-tested by biologists investigating an organism's phenotype and genotype iterations. The project will deliver a flexible, scalable workbench environment ready to be used for general genomic data analysis. In addition to education development activities, the project's impact will be enhanced by broad applications in other fields that handle large-scale multi-dimensional data sets.