CAREER: A Unifying Stochastic Framework for Temporally Consistent Computer Vision Models

$488,754FY2022CSENSF

University Of Florida, Gainesville FL

Investigators

Abstract

Sequential Monte Carlo methods are an effective mechanism to integrate observations over time in computer vision problems, especially in association with features generated by deep neural networks. However, it is still unclear how such networks can be interpreted as components of a stochastic inference system. This project will combine sequential Monte Carlo methods with neural networks to create a trainable stochastic framework for computer vision tasks. The developed framework will enable the design of autonomous and robotic systems that can interpret and interact with their environment, a critical component for the automation of tasks that must be performed in complex, unconstrained scenarios. These capabilities will be demonstrated through the development of novel agricultural robotic systems that generate accurate models of agricultural crops at varying levels of spatial and temporal granularity. With particular focus on under-represented populations, the project will provide research opportunities and hands-on training to graduate and undergraduate students on artificial intelligence topics and their applications to agricultural problems. It will also provide the students with foundational entrepreneurial skills that will allow them to identify problems of broad societal relevance that can be solved using machine learning and artificial intelligence methods. This research will create a stochastic framework that learns in an end-to-end manner how to leverage semantic information about objects of interest to assimilate spatial and temporal visual information and enforce temporal consistency in computer vision algorithms. Casting the multiple-object segmentation and tracking problem as a non-parametric pixel probability distribution estimation task will make it possible to devise uncertainty-aware models that learn how the appearance of objects varies over time given the context surrounding them. These research efforts will also introduce a new paradigm for the representation of motion models that enforce temporal consistency among video frames at the pixel level, obviating the need for object detection and localization techniques. Finally, these temporal association methods will substantially simplify the problem of recognizing and reconstructing complex objects in unstructured environments. By incorporating parameters of relevance to agricultural problems and extending the probabilistic models to satisfy domain-specific constraints, this project will devise novel techniques to extract semantic information and generate large-scale reconstructions of entire orchards at the granularity of individual leaves. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

View original record on NSF Award Search →