Bayesian computations in human 3D visual perception

$304,920 | R01 | FY2010 | EY | NIH

University Of Rochester, Rochester NY

Investigators

Linked publications & trials

Paper 26024461 Paper 24171930 Paper 23440185 Paper 21742982 Paper 21724567 Paper 20504749 Paper 20465321 Paper 19548796

Abstract

DESCRIPTION (provided by applicant): The goal of the proposed research is to understand how the human visual system resolves the inherent geometric ambiguities associated with most visual cues to depth. The brain can resolve cue ambiguity in two ways: (1) by applying prior knowledge of ecological constraints on the scene variables that determine a cue (e.g., that figures tend to be symmetric) and (2) by cooperatively using the information from other sensory cues to disambiguate the values of those variables. The first two principal aims focus on the first part of the problem. They are shaped by the observation that much of the statistical structure that makes monocular cues to depth informative is categorical in nature: motions are rigid or not, figures are symmetric or not, textures are homogeneous or not, etc. We will study how the visual system combines information from multiple cues to determine which of several possible prior constraints to use when interpreting a cue. Casting the problem within a Bayesian framework provides a formal system for modeling robust cue integration, which allows the visual system to deal effectively with large conflicts between sensory cues. We will perform experiments to test the Bayesian model against other models of robust cue integration. The model also provides a framework for characterizing how the brain adapts its internal models of the prior statistics that make monocular cues informative. We will study how human observers use the information obtained by combining multiple cues to adapt these internal models and how this adaptation affects the way they integrate cues to estimate surface orientation and shape. The final principal aim tests whether and how the brain uses non-visual information (haptic and kinesthetic) derived from active movement and exploration of objects to disambiguate the scene properties on which visual cues depend.
The research will focus on three monocular visual cues to surface orientation and shape (figure shape, texture, and motion) and on how the brain combines these cues with stereoscopic cues. The psychophysics is motivated by, and will be coupled with, computational modeling of ideal Bayesian observers for visual cue integration, learning, and multi-modal cue integration. The results of the proposed research will elucidate the types of statistical inferences that are built into the neural computations underlying visual depth perception and define the limits of these computations. This will ultimately direct and constrain future studies of the neural mechanisms underlying vision.
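
The idea of robust cue integration under a categorical prior can be illustrated with a minimal sketch. This is not the applicant's model; it is a toy under stated assumptions: surface slant is estimated from one stereoscopic and one monocular measurement, both with Gaussian noise; the monocular cue is tightly informative only if a prior constraint holds (for example, that the figure is symmetric) and is nearly uninformative otherwise; and the prior over slant is flat. The function name, the noise levels, and the prior probability of the constraint are all hypothetical.

```python
import math


def gaussian(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def robust_slant_estimate(x_stereo, x_mono, sd_stereo=4.0, sd_mono=4.0,
                          sd_broad=40.0, p_constraint=0.9):
    """Return (posterior mean slant in degrees, posterior P(constraint holds)).

    All parameter values are illustrative assumptions, not fitted quantities.
    The monocular likelihood is a two-component mixture: narrow (sd_mono) if
    the prior constraint holds, broad (sd_broad) if it does not.
    """
    grid = [i * 0.1 for i in range(-900, 901)]  # candidate slants, -90..90 deg
    post_on, post_off = [], []
    for s in grid:
        like_stereo = gaussian(x_stereo, s, sd_stereo)
        # Unnormalized joint posterior over (slant, constraint holds / fails).
        post_on.append(p_constraint * like_stereo * gaussian(x_mono, s, sd_mono))
        post_off.append((1.0 - p_constraint) * like_stereo * gaussian(x_mono, s, sd_broad))
    total_on, total_off = sum(post_on), sum(post_off)
    total = total_on + total_off
    mean = sum(s * (a + b) for s, a, b in zip(grid, post_on, post_off)) / total
    return mean, total_on / total


if __name__ == "__main__":
    # Small conflict: the cues are averaged, and the constraint is believed.
    print(robust_slant_estimate(30.0, 34.0))
    # Large conflict: the constraint is rejected and stereo dominates.
    print(robust_slant_estimate(30.0, 60.0))
```

With a small cue conflict the estimate falls close to the reliability-weighted average of the two measurements. With a large conflict the posterior probability of the constraint collapses and the estimate reverts toward the stereoscopic measurement, which is the qualitative signature of robust integration described above.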
