GGrantIndex
← Search

CLUSTERING BIOLOGICAL DATA USING MESSAGE PASSING

$38,262P20FY2009RRNIH

University Of Nebraska Medical Center, Omaha NE

Investigators

Linked publications & trials

Abstract

This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator. Clustering algorithms are widely used in bioinformatics to classify data, as in the analysis of gene expression and in the building of phylogenetic trees. Biological data often describe parallel and spontaneous processes. To capture these features, we propose a new clustering algorithm that employs the concept of message passing. Message Passing Clustering (MPC) allows data objects to communicate with each other and produces clusters in parallel, thereby making the clustering process intrinsic. We have proved that MPC shares similarity with Hierarchical Clustering (HC) but offers significantly improved performance because it takes into account both local and global structure. We have analyzed 35 sets of simulated dynamic gene expression data, achieving a 95% hit rate in which 639 genes out of a total 674 genes were correctly clustered. We have also applied MPC to a real data set to build a phylogenic tree from aligned mycobacterium sequences. The results show higher classification accuracies as compared to traditional clustering methods such as HC.

View original record on NIH RePORTER →