CAREER: Holistic Scene Understanding with Multiple Hypotheses from Vision Modules

$435,122FY2017CSENSF

Georgia Tech Research Corporation, Atlanta GA

Investigators

Abstract

This project develops algorithms and techniques for holistic scene understanding from images. The key barrier to building the next generation of vision systems is ambiguity. For example, a patch from an image may look like a face but may simply be an incidental arrangement of tree branches and shadows. Thus, a vision module operating in isolation often produces nonsensical results, such as hallucinating faces floating in thin air. This project develops a visual system that jointly reasons about multiple plausible hypotheses from different vision modules such as 3D scene layout, object layout, and pose estimation. The developed technologies have the potential to improve vision systems and make fundamental impact - from self-driving cars bringing mobility to the physically impaired, to unmanned aircrafts helping law enforcement with search and rescue in disasters. The project involves research tightly integrated with education and outreach to train the next generation of young scientists and researchers. This research addresses the fundamental challenge in joint reasoning by extracting and leveraging a small set of diverse plausible hypotheses or guesses from computer vision modules (e.g. a patch may be a {sky or a vertical surface} x {face or tree branches}). This project generates new knowledge and techniques for (1) generating a small set of diverse plausible hypotheses from different vision modules, (2) joint reasoning over all modules to pick a single hypothesis from each module, and (3) reducing human annotation effort by actively soliciting user feedback only on the small set of plausible hypotheses. Project Webpage: http://computing.ece.vt.edu/~dbatra

View original record on NSF Award Search →