CAREER: Brain-inspired Methods for Continual Learning of Large-scale Vision and Language Tasks
University of Rochester, Rochester, NY
Investigators
Abstract
The goal of this research project is to create deep neural networks that excel in a broad set of circumstances, are capable of learning from new data over time, and are robust to dataset bias. Deep neural networks can now perform some tasks as well as humans, such as identifying faces, recognizing objects, and other perception tasks. However, existing approaches have three limitations: an inability to learn effectively over time from structured data without forgetting past information, slow learning that requires looping over the data many times, and amplification of pre-existing dataset bias, which results in erroneous predictions for groups with less data. To overcome these problems, this research project aims to incorporate memory consolidation processes, inspired by the mammalian memory system, that occur both when animals are awake and when they are asleep. The new methods developed in this project could lead to machine learning systems that 1) are more power efficient, 2) can learn on low-powered mobile devices and robots, and 3) can overcome bias in datasets. In addition, a significant educational component involves training the next generation of scientists and engineers, via new courses and programs, to deploy machine learning systems that are safe, reliable, and well tested.
In greater technical detail, this project will develop new measures for neural networks to 1) test for biases, 2) assess the acquisition of robust concepts, and 3) study forward transfer in neural networks trained over time. The project proposes new brain-inspired algorithms that learn online and then use downtime periods to engage in greater levels of memory consolidation, a design informed by neuroscience findings on the neural activity that occurs during the wake-sleep cycles of humans and other mammals. The proposed algorithms are based on the brain's complementary learning systems for memory formation, storage, and retrieval. The models are evaluated on large-scale incremental image classification tasks as well as tasks involving multi-modal scene understanding and abstract reasoning. This research will provide building blocks that others can use to create new algorithms and applications. All code and datasets will be made publicly available.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.