EAGER: Visual Saliency with Discriminancy, Sparsity and Connectivity

$135,742 · FY2012 · CSE · NSF

University Of California - Merced, Merced CA

Investigators

Abstract

This project explores new directions for computing top-down modulated visual saliency maps based on three basic principles: discriminancy, sparsity and connectivity. The research identifies key factors for advancing the state of the art and presents a novel latent variable model, which extends the classical conditional random field with an embedded layer of latent variables to exploit the sparse nature of features for saliency maps. This sparse latent variable conditional random field model can be considered a joint optimization of group sparse coding and a conditional random field, which can be solved with an efficient stochastic gradient descent algorithm. Unlike bottom-up saliency, this model facilitates high-level visual recognition tasks by learning sparse image structures from objects of interest. The key intellectual contributions of this project are a novel formulation that considers all three important properties for visual saliency in a unified framework, and an efficient learning algorithm to estimate the model parameters. With the developed techniques, the search regions of these vision tasks can be constrained, thereby reducing computational complexity and enhancing robustness. Effective top-down modulated visual saliency algorithms have broad applications including object detection, object recognition, visual tracking, scene analysis, image compression, surveillance, and robotics. They also provide a crucial tool for studying and analyzing eye-movement fixations in cognitive science. The research results, including code and data, are made public on the project web site.
