Hierarchal Perceptual Organization with the Center-Surround Algorithm

$500,000 | FY2003 | CSE | NSF

Purdue University, West Lafayette IN

Investigators

Jeffrey M Siskind (contact), Charles A Bouman, Ilya Pollak

Abstract

Robotics and Computer Vision Program

Proposal #: 0329156
Title: Hierarchal Perceptual Organization with the Center-Surround Algorithm
PI: Siskind, Jeffrey
Purdue University

The objective of this research is to develop a general analytic and algorithmic framework for multidimensional probabilistic context-free grammars (PCFGs) that can be used to model the hierarchical structure of images and other multidimensional data sets. This framework extends PCFGs from 1D word strings to 2D image data and similarly extends the inside-outside algorithm to support training, classification, and parsing on 2D image data with these extended PCFGs. The extended framework is called spatial random trees (SRTs), and the extended algorithm is called the center-surround algorithm. The framework is both sound and efficient because of a novel notion of constituency that constrains the allowable ways to partition a parent segment into child subsegments during parsing.

This research has great intellectual merit because it forms a fundamental basis for:

- Inferring semantically meaningful hierarchical structure from low-level image properties such as edge saliency and region shape, color, texture, and relative position.
- Discovering the common hierarchical structure shared by a collection of natural images in an unsupervised fashion from unlabeled training data.
- Distinguishing between different natural image-scene classes on the basis of global hierarchical structure, rather than local low-level features.

This research will achieve broad impact by addressing a problem that is shared among a wide array of applications in a variety of technical fields. In particular, we will:

- Extend the SRT framework so that it can be used to accurately model the geometric relations between constituents in hierarchical structures. This will enhance the value of SRTs in high-level modeling of images.
- Develop tools for combining SRT models and merging SRT models with other available data models. This will provide a general framework for improving both the speed and the accuracy of the methods.
- Explore the use of SRTs as a distance metric for classifying high-dimensional data. This opens the techniques to potential applications such as Web clustering.
- Develop a unified approach to combined spatial and temporal parsing of video. These new methods can support both video indexing and surveillance tasks.
- Develop novel approaches for the parsing and recognition of images. This can be useful in applications such as the analysis of printed information, the monitoring of surveillance video, or the analysis of medical imagery.

The research team brings to the project expertise from a wide variety of fields, including computational linguistics, machine vision, inverse problems, stochastic processes, and natural language processing. This broad background allows the team to collectively leverage ideas from multiple fields and to have a broad impact on these fields in a way that would not be possible without such collaboration.
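
As background for the 1D starting point named in the abstract, the following is a minimal sketch of the inside pass of the standard inside-outside algorithm for a PCFG in Chomsky normal form. The toy grammar, the function name `inside_probability`, and the dictionary layout are illustrative assumptions and do not come from the proposal. The 2D center-surround algorithm and the SRT notion of constituency are not shown here.

```python
from collections import defaultdict

# Hypothetical toy grammar (an assumption for illustration only):
#   S -> S S   with probability 0.4
#   S -> 'a'   with probability 0.6
TOY_LEXICAL = {("S", "a"): 0.6}        # (nonterminal, word) -> P(A -> word)
TOY_BINARY = {("S", "S", "S"): 0.4}    # (parent, left, right) -> P(A -> B C)


def inside_probability(words, lexical, binary, start="S"):
    """Return the total probability that `start` derives `words`.

    This is the inside pass for a 1D PCFG in Chomsky normal form.
    """
    n = len(words)
    # inside[(i, j)][A] = probability that nonterminal A derives words[i:j]
    inside = defaultdict(lambda: defaultdict(float))

    # Base case: spans of length one are produced by lexical rules.
    for i, word in enumerate(words):
        for (nonterminal, rule_word), prob in lexical.items():
            if rule_word == word:
                inside[(i, i + 1)][nonterminal] += prob

    # Recursive case: build longer spans from every split into two children.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point inside the 1D span
                for (parent, left, right), prob in binary.items():
                    inside[(i, j)][parent] += (
                        prob * inside[(i, k)][left] * inside[(k, j)][right]
                    )

    return inside[(0, n)][start]
```

Calling `inside_probability(["a", "a", "a"], TOY_LEXICAL, TOY_BINARY)` sums over both binary bracketings of the three-word string. In 1D a span has only a linear number of split points; the abstract's point is that a 2D analogue needs a constrained way of partitioning a parent segment into child subsegments to stay tractable.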
