
CAREER: Learning Models for Scalable Content-Based Image Retrieval

$494,964 | FY2010 | CSE | NSF

Dartmouth College, Hanover NH


Abstract

This project addresses the design of machine learning algorithms that enable content-based image retrieval in Web-scale collections of photos. The research formulates image retrieval as a binary classification problem: decide which database images are the "same" as the user-provided photo. Efficiency and scalability to large collections are achieved by constraining the classifiers to be models supported by traditional text-search engines, which perform real-time search in databases of several billion documents. To implement search based on high-level notions of similarity, the research team develops methods to automatically localize the most content-relevant regions in the input photo and to extract from them semantically powerful classifiers that combine appearance cues with robust geometric constraints. The algorithms learn from user-provided labels indicating the presence but not the location of similar visual content, and thus require a minimal amount of human supervision. This research also investigates how this advanced form of similar-image search can be used to organize personal photos, provide semantic annotations, and support content-based clustering of pictures. Furthermore, this work provides technical advances in a wide range of computer vision problems, including object detection, visual saliency, and content-based clustering of photos. The research team is also collecting an unprecedentedly large image data set, both to evaluate the developed image retrieval system and to make available to the community. Research is naturally integrated with education and outreach by means of related courses and out-of-classroom activities aimed at attracting students to this field and at encouraging interdisciplinary collaborations.
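The idea of constraining a classifier to a model that a text-search engine can evaluate may be easier to picture with a small sketch. The code below is illustrative only and is not taken from the project: it assumes each image is represented as a set of integer "visual word" ids, and that the classifier is a sparse linear model whose score can be accumulated by walking the posting lists of an inverted index. All names (`build_inverted_index`, `retrieve`, the toy data) are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(database):
    """Map each visual-word id to the set of image ids that contain it."""
    index = defaultdict(set)
    for image_id, words in database.items():
        for word in words:
            index[word].add(image_id)
    return index


def retrieve(index, weights, bias, top_k=3):
    """Rank images classified as "same" by a sparse linear classifier.

    Only posting lists of words with a nonzero weight are visited, which is
    how a text-search engine evaluates a weighted query. Assumption: bias is
    not positive, so images sharing no weighted word can never be positive
    and may be skipped safely.
    """
    scores = defaultdict(float)
    for word, weight in weights.items():
        for image_id in index.get(word, ()):
            scores[image_id] += weight
    # Add the bias, keep positive decisions, sort by score then by id.
    ranked = sorted(
        ((score + bias, image_id) for image_id, score in scores.items()),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return [(image_id, score) for score, image_id in ranked if score > 0][:top_k]


# Toy database: image id -> set of visual-word ids (made-up values).
database = {"img_a": {1, 2, 3}, "img_b": {2, 4}, "img_c": {5}}
index = build_inverted_index(database)

# Hypothetical classifier for one query photo: word id -> learned weight.
weights = {1: 1.0, 2: 0.5, 4: -1.0}
results = retrieve(index, weights, bias=-0.25)
```

In this sketch the cost of a query depends on the lengths of the visited posting lists rather than on the size of the database, which is the property that makes text-search engines attractive for very large collections.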
