CPA-ACR: Parallel Algorithms and Software for Large Scale Microarray Data Analysis and Gene Network Inference
Iowa State University, Ames IA
Investigators
Abstract
High-throughput gene expression profile measurements enabled by microarrays have spawned significant advances in functional genomics and systems biology. The vast number of microarray experiments conducted over the past decade has generated a wealth of expression data available from several public repositories. While algorithms for microarray data analysis and gene network inference have been well studied, most available methods and programs are sequential and cannot scale to large numbers of experiments because of both memory and time constraints. In this project, the investigators will develop high-performance parallel computational methods for large-scale gene expression analysis and gene network inference, utilizing tens of thousands of microarray experiments available in public repositories. The primary research goal is to develop the capability to simultaneously analyze the entire gamut of gene expression data available for an organism, and thereby to make biological discoveries and build robust, accurate networks that would not be possible through limited, compartmentalized analysis. The research will be carried out using gene expression profiles of the plant Arabidopsis thaliana, a well-studied model organism and the focus of the decade-long NSF Arabidopsis 2010 initiative. The investigators will develop 1) parallel algorithms for biclustering large gene expression matrices, 2) parallel algorithms for inferring gene networks using Mutual Information and Bayesian approaches, and 3) methods for querying and analyzing large-scale biological networks. The project will be led by an interdisciplinary team of investigators whose expertise spans parallel algorithms, scalable computing and software development, bioinformatics and systems biology, statistical analysis, microarray experimental techniques and analysis, and organism-specific knowledge of Arabidopsis.
The project will lead to the development of advanced computational methods and open-source software for systems biology.