EAGER: Exploring Cognitively Plausible Computational Models for Processing Human Language
University of Massachusetts Lowell, Lowell MA
Investigators
Abstract
Despite the recent successes of artificial intelligence techniques designed to process human language, most contemporary solutions handle only very specific language processing tasks. As a result, human-level language understanding is still out of reach for most current computational approaches, especially when retaining new information and reasoning over the accumulated knowledge are involved. This exploratory project advances the goal of developing more cognitively realistic computational models that can mimic some of the known properties of human language processing and, as a result, be more robust and better suited as general systems for language understanding, with human-like learning that involves obtaining and updating knowledge over time. While most contemporary deep learning approaches in natural language processing focus on task-specific end-to-end models, this project prioritizes generalist architectures that would be consistent with the current data on semantic priming, on grouping and chunking effects in the formation and use of conceptual systems, and on the effects of long- and short-term memory on the storage and retrieval of knowledge. In this project, novel neural network architectures are planned that model a subset of these properties. The processes that enable learning and memory via strengthening of synaptic connections in the brain will be emulated by a set of representational units (r-units) with bidirectional connections, modeling the interaction between small regions of neocortex during information processing. The Memory Store Activation State Model represents the connections between r-units in terms of convolutional filters applied to the memory store. The priming effects will be modeled by a pre-activation pattern produced via a sequence of deconvolutional operations.
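The relationship between a convolutional filter applied to a memory store and a deconvolutional pre-activation pattern can be illustrated with a minimal sketch. It is not the project's architecture: the one-dimensional memory store, the single filter, and the function names `activate` and `pre_activate` are all assumptions made for illustration. The sketch shows only that a transposed ("deconvolutional") pass maps r-unit activations back onto a pattern of the same shape as the store.

```python
import numpy as np

def activate(memory_store, filt):
    """Forward pass: slide one filter over the store (valid cross-correlation)."""
    return np.correlate(memory_store, filt, mode="valid")

def pre_activate(activations, filt):
    """Transposed ("deconvolutional") pass: project activations back onto the store."""
    return np.convolve(activations, filt, mode="full")

# Hypothetical toy data: a 16-slot memory store and a width-3 filter.
rng = np.random.default_rng(0)
memory_store = rng.normal(size=16)
filt = rng.normal(size=3)

activations = activate(memory_store, filt)   # one value per filter position
priming = pre_activate(activations, filt)    # same shape as the memory store
```

In this toy setting the second function is the exact transpose of the first, which is the usual sense in which a deconvolutional layer mirrors a convolutional one.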
The Rate-Based Connectivity Network model combines reinforcement learning on a per-node basis with a form of Hebbian learning applied to a time-varying system in which each r-unit calculates the rate of change of its output, allowing node activations to linger through time; it is trained with a discrete global reward signal. The goal of this project is to establish the feasibility of the proposed architectures by developing the initial proof-of-concept prototypes, demonstrating that they are able to converge on simple learning tasks, and applying them to the task of language modeling to ensure that a practically useful representation can be learned. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
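The combination of rate-of-change dynamics with reward-modulated Hebbian learning can be sketched as follows. This is one possible reading and not the project's model: the leaky update rule, the `tanh` nonlinearity, the Hebbian trace, and every name and parameter value (`step`, `run_episode`, `apply_reward`, `dt`, `tau`, `lr`) are assumptions introduced for illustration.

```python
import numpy as np

def step(outputs, weights, external_input, dt=0.1, tau=1.0):
    """One Euler step: each unit computes the rate of change of its own output."""
    drive = np.tanh(weights @ outputs + external_input)
    rate = (drive - outputs) / tau            # d(output)/dt for every unit
    return outputs + dt * rate

def run_episode(weights, inputs, dt=0.1, tau=1.0):
    """Run the units over an input sequence, accumulating a Hebbian trace."""
    outputs = np.zeros(weights.shape[0])
    trace = np.zeros_like(weights)
    history = []
    for x in inputs:
        new_outputs = step(outputs, weights, x, dt, tau)
        trace += np.outer(new_outputs, outputs)  # post-unit x pre-unit co-activity
        outputs = new_outputs
        history.append(outputs.copy())
    return np.array(history), trace

def apply_reward(weights, trace, reward, lr=0.01):
    """Scale the Hebbian trace by a discrete global reward (+1 or -1)."""
    return weights + lr * reward * trace

# Hypothetical toy run: drive unit 0 for 10 steps, then remove the input.
weights = np.zeros((3, 3))
inputs = [np.array([1.0, 0.0, 0.0])] * 10 + [np.zeros(3)] * 10
history, trace = run_episode(weights, inputs)
rewarded = apply_reward(weights, trace, reward=+1)
punished = apply_reward(weights, trace, reward=-1)
```

Because each unit integrates its rate of change instead of recomputing its output from scratch, the activation of unit 0 decays gradually after the input is removed, which is one simple way activations can linger through time.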