CAREER: Visual Learning in an Open and Continual World

$434,228 | FY2023 | CSE | NSF

Georgia Tech Research Corporation, Atlanta GA

Investigators

Abstract

The field of computer vision has seen significant progress in the past decade: models are now able to efficiently process complex images and automatically extract information, such as detecting what types of objects exist in an image and where they are located. However, current methods require a pre-specified list of the object categories that appear in the images. This requirement is unrealistic when systems are deployed in real-world contexts, such as on self-driving cars or large photo collections. If new types of objects appear, humans must identify the new objects and annotate the images, and the computer vision model must then be retrained through a process that takes significant computational resources. Unlike humans, the system cannot automatically understand when new types of objects are in the images, how they relate to objects that the system already knows about, or how to continually update its knowledge given little to no human annotation. This project therefore seeks to enable a computer vision system that can continuously and automatically detect and discover new categories, as well as update its model, with little to no human annotation. Such a capability would have implications in a range of applications, including personalized analysis of photo collections, home robotics, self-driving cars, and medical imaging, where novel unknown objects often lead to misleading or incorrect object detection. The project will address this through a range of research innovations as well as through several outreach activities, including democratizing AI education by working with educators from K-12 and up to teach our open-source course materials.
Towards this end, the goal of this project is to create a framework for an open-world and continual learning system that develops principled methods for naturally understanding and handling different types of distribution shifts, as well as incrementally discovering and learning new categories as they appear in unlabeled data and placing them within a rich semantic hierarchical structure. This will be accomplished by first detecting the different types of distribution shift that can occur (e.g., changes in appearance due to weather, or the existence of entirely new objects) and developing principled out-of-distribution detection and calibration methods to disentangle them. These methods will be used to understand how each type of shift affects the model's predictions. Subsequently, rather than just detecting whether new categories exist and throwing the resulting data out, this fine-grained understanding of distribution shift will support incrementally updating the model in response. This will be done by developing methods to build long-term representations and classifiers that discover new categories and place them within a rich hierarchical semantic structure. Finally, semi-supervised continual learning will be leveraged to incrementally refine the representations and automatically learn classification and detection models, using a mixture of labeled and unlabeled data appearing at different times, while avoiding catastrophic forgetting.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
