Geometric and Semantic Structures for Two- and Three-Dimensional Shape Understanding

$160,000 | FY2020 | MPS | NSF

Occidental College, Los Angeles CA

Investigators

Abstract

True computer vision will provide end-to-end image analysis, in which images are decomposed into objects of interest, those objects are decomposed into parts, and both the parts and the objects are recognized. Performing integrated tasks with an image, such as shape generation, animation, editing, or partial matching, requires structure-aware shape processing. A full shape structure consists of a decomposition into parts, an understanding of which parts are more significant than others, and the ability to measure similarity between parts as a step toward recognition. A pipeline that takes two- or three-dimensional images as input, performs accurate segmentation to determine shapes of interest, extracts a shape structure, and then recognizes the parts and the shapes would represent a fundamental step forward in artificial vision. The task is challenging because human visual perception does not follow computational rules. For example, two shapes can both be similar to a third shape without being similar to each other. In addition, our understanding of the meaning of shapes adds a semantic level to our geometric perception: if someone is seated on an object, we classify that object as a chair regardless of its shape. Any useful shape analysis must explicitly model the interplay between semantics and geometric shape. This project aims to develop the foundational theory of shape structure and provide robust implementations of the resulting techniques while maintaining the connection to human semantic perception through benchmarking against user studies.

The Blum medial axis gives a skeletal decomposition of a closed region in Euclidean space. In spatial dimensions 2 and 3, these regions can be interpreted as 2D and 3D shapes, with the skeletal model providing a lower-dimensional representation of the shape. The skeleton, a Whitney stratified set, is a deformation retract of the shape and, together with its radius function, captures complete geometric information about the boundary of the shape. This project will introduce functions on the medial axis that encode shape geometry in a way that allows for the determination of a parts decomposition and hierarchy within a shape, as well as similarity between parts, for shapes of any finite genus. Based on that analysis, the research will develop formal measures of shape complexity and benchmark results through human perception studies. Finally, the project aims to connect the new shape structure characterization to current approaches using neural networks for image understanding by developing network architectures that learn the geometry of a shape skeleton from its natural or binary image representation.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
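
The Blum medial axis mentioned in the abstract can be described as the set of interior points that have two or more closest points on the shape boundary. The sketch below is only an illustration of that definition, not the project's method: it assumes the simplest possible shape, an axis-aligned rectangle [0, width] x [0, height], and tests only interior points with integer coordinates. The function name `rectangle_medial_axis` is hypothetical. For a rectangle, the closest boundary point on each side is the perpendicular foot, so a point lies on the medial axis exactly when its smallest side distance is attained by at least two sides.

```python
def rectangle_medial_axis(width, height):
    """Return {(x, y): radius} for the interior integer points of the
    rectangle [0, width] x [0, height] that lie on its Blum medial axis.

    Illustrative assumption: the shape is an axis-aligned rectangle and
    only integer lattice points are tested, so all comparisons are exact.
    """
    axis = {}
    for x in range(1, width):
        for y in range(1, height):
            # Perpendicular distances to the left, right, bottom, and top sides.
            side_distances = [x, width - x, y, height - y]
            radius = min(side_distances)
            # A medial point has two or more closest boundary points,
            # i.e. the minimum distance is attained by at least two sides.
            if side_distances.count(radius) >= 2:
                axis[(x, y)] = radius
    return axis


if __name__ == "__main__":
    # Print each medial point with its radius (distance to the boundary).
    for point, radius in sorted(rectangle_medial_axis(6, 4).items()):
        print(point, radius)
```

For a 6 by 4 rectangle, the points found form a central horizontal segment with one diagonal branch toward each corner. The stored radius at each point is the kind of function on the medial axis the abstract refers to: the skeleton alone gives the topology, while the radius records how far the boundary lies from it.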
