EAGER: Deep Architectures for Predicting 3D Object Motion

$175,406 | FY2019 | CSE | NSF

University of Massachusetts Amherst, Amherst, MA

Abstract

Our everyday living environments are populated with many functional objects with which we interact through their moving parts (e.g., swivel chairs, laptops, bikes, and cars). For autonomous agents to interact correctly with these objects in real-world settings, they must be equipped with algorithms that can parse the objects into their moving parts. But that alone is not enough. Through the widespread use of commodity 3D sensors and modern 3D modeling techniques, large repositories (such as ShapeNet) containing millions of digital representations of everyday objects are now available, but most of these representations are static: they capture single snapshots of objects. To use them in dynamic virtual environments and in animation applications, methods are needed that automatically segment them into moving parts and synthesize plausible motions for those parts. This project will explore the design, implementation, and testing of new deep learning architectures that accomplish this, and thereby bring large portions of static 3D datasets "to life." The new algorithms will have broad industrial impact by advancing 3D modeling and animation software, while the generated motion data will be useful for training new computer vision algorithms for object motion recognition and tracking in videos.
Achieving the project goals will require new algorithms that convert static digital representations of 3D objects into dynamic ones by automatically recognizing their moving parts and animating them based on input reference videos of similar real-world objects. This in turn will require novel methods for estimating partial 2D-3D correspondences, for lifting 2D motion cues to 3D, and for inferring motion rigs for 3D shapes. The project will be organized into two main thrusts, each of which will present its own research challenges.
The first thrust will investigate new deep learning architectures for performing mobility-based segmentation of 3D objects and predicting the underlying motion of their parts. The architecture will be applied to man-made objects with rigidly moving parts. This part of the research will be carried out in the first year of the project. The second thrust will extend the previous work to animate 3D models representing living organisms such as quadrupeds, birds and fish (i.e., animals from the DigitalLife dataset). These models undergo non-rigid deformations, so the architecture will have to be modified to estimate and control more sophisticated deformation primitives. This part of the work will be executed in the second year of the project.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
