RI: Small: Uncovering Dynamics from Internet Imagery

$450,000 · FY2018 · CSE · NSF

University of North Carolina at Chapel Hill, Chapel Hill, NC

Investigators

Abstract

Virtual and augmented reality (VR/AR) technologies promise to enable new and exciting ways of perceiving the world from the comfort of our homes and desks. Among the current applications of 3D VR visualization, realistic depictions of real-world environments are highly desired for educational experiences. This project will develop scalable algorithms for computing "living 3D models" that can represent elements such as people and cars moving around the scene, water flowing in fountains, or chairs outside cafes being arranged in different places on different days. To overcome the need for dedicated capture, the project targets publicly available Internet photo collections, which have the requisite data diversity to drive large-scale, cost-effective VR/AR content generation. The research not only supports the field of VR/AR but also provides improved analysis methods for a broad range of other applications, including forensic analysis, cultural heritage conservation, city planning, virtual training, and education, with particular potential impact in enhancing social studies experiences for economically disadvantaged students. This project will aggregate object instances in the individual 2D images of the photo collection to infer the motion dynamics of the entire class of objects in the scene, e.g., all cars or all people. The method will thus infer and model the motion dynamics without ever seeing the motion of these objects, since there is typically only one observation per object instance available due to the uncontrolled, crowd-sourced capture. The key information for the inference will be the observed variation in the density of dynamic scene elements. The novel scene representation stores the accumulated dynamics in object class scene occupancy maps, as well as object class motion flows for the scene, e.g., information about where pedestrians move to and how they move within the scene.
The developed methodology will open new and exciting avenues for research on jointly recovering semantic labels and 3D geometry in the wild, a task that is currently one of the most challenging problems in 3D computer vision. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
