CAREER: Bootstrapping Recognition from Little Data in New Domains

$485,806 | FY2022 | CSE | NSF

Cornell University, Ithaca NY

Investigators

Abstract

This award is funded in part under the American Rescue Plan Act of 2021 (Public Law 117-2).

With current computer vision technology, one can easily build software that recognizes thousands of objects, be they cats, cars or birds. However, such systems must be trained on large datasets of millions of images that have been painstakingly labeled by human annotators. Such large, labeled datasets are difficult to create for many application domains, such as microscopy, where annotators need to be highly trained. Collecting large datasets may also run afoul of privacy concerns, and may need expensive curation to remove racial, gender or other kinds of bias. Finally, the steep cost of collecting large datasets makes computer vision technology inaccessible to many. This project develops technologies for recognition systems that can successfully identify difficult visual concepts without large datasets. Recognition systems that work with limited training data will unlock many downstream applications, especially in specialized domains, and will make advances in computer vision technology accessible to all. The project team will make these recognition systems broadly available, organize a workshop for high school students, and develop a new computer vision curriculum that focuses on broad applications of computer vision technology.

To build recognition systems from little data (as few as 1-2 labeled images and around 1000 unlabeled images), this project explores two strategies inspired by human vision. First, unlike current recognition systems that are trained in isolation for each problem domain, humans learn to perform new visual tasks (such as analyzing microscopy images) in the context of their vast prior visual experience. This project similarly designs visual learners that learn new recognition tasks from limited data by leveraging a memory of past visual tasks across multiple domains. Second, unlike current systems that learn only from labeled images, humans learn through rich, back-and-forth interactions with expert teachers. Inspired by this observation, this project builds systems that can learn by asking detailed questions of experts.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
