CLUSTERING AND SEARCH FOR INFORMATION SYSTEMS USED BY BIOLOGISTS

$1,094 | P41 | FY2011 | RR | NIH

Carnegie-Mellon University, Pittsburgh PA

Investigators

Linked publications & trials

Paper 36360278 Paper 35980576 Paper 35037852 Paper 32078570 Paper 31928911 Paper 30770471 Paper 30768618 Paper 30208027 Paper 29807068 Paper 29340712 Paper 29111975 Paper 28711156 Paper 28591602 Paper 28498103 Paper 28487483 Paper 28117658 Paper 27852862 Paper 27616867 Paper 27586284 Paper 27477444 Paper 27087801 Paper 27018655 Paper 26712581 Paper 26566294 Paper 25864451 Paper 25750583 Paper 25652927 Paper 25640641 Paper 25613439 Paper 25553823 Paper 25484487 Paper 25449739 Paper 25445671 Paper 25309967 Paper 25297958 Paper 25285322 Paper 25257021 Paper 25215768 Paper 25116421 Paper 24795172 Paper 24779031 Paper 24709600 Paper 24688659 Paper 24518260 Paper 24435020 Paper 24332165 Paper 24313792 Paper 24137667 Paper 24032517 Paper 24027610 Paper 24007457 Paper 23891882 Paper 23826091 Paper 23790384 Paper 23607565 Paper 23426110 Paper 23382875 Paper 23379664 Paper 23339564 Paper 23292636 Paper 23278450 Paper 23274692 Paper 23178122 Paper 23159228 Paper 23102681 Paper 23076044 Paper 23000369 Paper 22978431 Paper 22958375 Paper 22926267 Paper 22875861 Paper 22843728 Paper 22771232 Paper 22734487 Paper 22705388 Paper 22641475 Paper 22623398 Paper 22588133 Paper 22468611 Paper 22248573 Paper 21991539 Paper 21813687 Paper 21745372 Paper 21683051 Paper 21574563 Paper 21574558 Paper 21452901 Paper 21390124 Paper 21256056 Paper 21243078 Paper 21143936 Paper 21085574 Paper 20939567 Paper 20824148 Paper 20809638 Paper 20684632 Paper 20672015 Paper 20618850 Paper 20593474 Paper 20570399

Abstract

This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. Primary support for the subproject and the subproject's principal investigator may have been provided by other sources, including other NIH sources. The Total Cost listed for the subproject likely represents the estimated amount of Center infrastructure utilized by the subproject, not direct funding provided by the NCRR grant to the subproject or subproject staff.

Spectral clustering is an effective and elegant clustering method based on the pairwise similarity between objects. Recently I have developed a fast and simple spectral-clustering-like technique called power iteration clustering. As in spectral clustering, points are embedded in a low-dimensional subspace derived from the similarity matrix for the data points; however, whereas in spectral clustering the subspace is derived from the bottom eigenvectors of the Laplacian of an affinity matrix, in the proposed method the subspace is an approximation to a linear combination of these eigenvectors. The new method obtains clusters comparable to or better than those of existing spectral methods, but is extremely scalable and well suited to parallel processing on a cluster machine (such as codon or warhol).

I would like to explore the use of this clustering method for information spaces associated with biologists' personal information needs; this is a project funded by NIH, but one without extensive computational resources at the moment.

For more information on the technique, see http://www.cs.cmu.edu/~wcohen/postscript/nips-2009-pic.pdf
For more information on the project, see http://www-2.cs.cmu.edu/~wcohen/querendipity/
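To make the idea in the abstract concrete, the following is a minimal sketch of a power-iteration-style clustering routine, not the project's actual code. It assumes a dense Gaussian affinity matrix, a degree-based starting vector, a simplified stopping rule based on how much the per-point change varies between iterations, and a small one-dimensional k-means step. The function names (`gaussian_affinity`, `power_iteration_embedding`, `kmeans_1d`, `pic_cluster`) and all parameter defaults are illustrative assumptions.

```python
import numpy as np


def gaussian_affinity(points, sigma=1.0):
    """Pairwise Gaussian similarity with a zero diagonal (an assumed choice)."""
    sq_dist = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    affinity = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(affinity, 0.0)
    return affinity


def power_iteration_embedding(affinity, max_iter=100, tol=1e-6):
    """Run a truncated power iteration on the row-normalized affinity matrix.

    Stopping early leaves a one-dimensional embedding in which points from
    the same cluster tend to share nearly the same value.
    """
    A = np.asarray(affinity, dtype=float)
    degree = A.sum(axis=1)            # assumes every point has nonzero degree
    W = A / degree[:, None]           # row-normalized affinity
    v = degree / degree.sum()         # degree-based starting vector
    prev_delta = np.full(len(v), np.inf)
    for _ in range(max_iter):
        v_new = W @ v                 # one matrix-vector product per step
        v_new /= np.abs(v_new).sum()  # L1 normalization
        delta = np.abs(v_new - v)
        v = v_new
        # Stop once the per-point change has itself stopped changing.
        if np.abs(delta - prev_delta).max() < tol:
            break
        prev_delta = delta
    return v


def kmeans_1d(values, k, n_iter=50):
    """Plain Lloyd iterations on scalars, seeded at evenly spaced quantiles."""
    centers = np.quantile(values, np.linspace(0.0, 1.0, k))
    labels = np.zeros(len(values), dtype=int)
    for _ in range(n_iter):
        labels = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    return labels


def pic_cluster(affinity, k):
    """Embed with truncated power iteration, then cluster the embedding."""
    return kmeans_1d(power_iteration_embedding(affinity), k)


# Toy usage: two well-separated blobs of different sizes (synthetic data).
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(0.0, 0.3, size=(30, 2)),
    rng.normal(5.0, 0.3, size=(10, 2)),
])
labels = pic_cluster(gaussian_affinity(points), k=2)
```

Each iteration in this sketch costs a single matrix-vector product, which is the part that would be distributed across nodes on a cluster machine. A sparse affinity matrix would be the natural substitution for larger data sets.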
