Investigating inductive biases for language acquisition with meta-learning
Investigators
McCoy, Richard T, Baltimore MD
Abstract
This award was provided as part of NSF's Social, Behavioral and Economic Sciences (SBE) Postdoctoral Research Fellowships (SPRF) program and SBE's Linguistics program. The goal of the SPRF program is to prepare promising, early career doctoral-level scientists for scientific careers in academia, industry or private sector, and government. SPRF awards involve two years of training under the sponsorship of established scientists and encourage Postdoctoral Fellows to perform independent research. NSF seeks to promote the participation of scientists from all segments of the scientific community, including those from underrepresented groups, in its research programs and activities; the postdoctoral period is considered to be an important level of professional development in attaining this goal. Each Postdoctoral Fellow must address important scientific questions that advance their respective disciplinary fields. Under the sponsorship of Dr. Thomas L. Griffiths at Princeton University, this postdoctoral fellowship award supports an early career scientist investigating how children can acquire language so rapidly and how we can create artificial intelligence systems with similarly impressive learning abilities.
Language acquisition involves a complex interplay between the data and the learner. Suppose a learner has been told that a green triangle is an example of a “dax.” A learner preferring shape-based generalizations would conclude that “dax” means “triangle,” while a learner preferring color-based generalizations would conclude that “dax” means “green object.” The properties of the learner that guide generalization are called inductive biases. It is not clear which inductive biases enable people to acquire languages so rapidly. The proposed research will investigate this topic using computational modeling.
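The “dax” example can be made concrete with a toy Bayesian learner. The sketch below is an illustration only, not the project's model: the two hypotheses, the prior values, and the all-or-nothing likelihood are assumptions chosen for simplicity. Both learners see the same single example; only their priors (their inductive biases) differ.

```python
# Toy illustration (all names and numbers are hypothetical assumptions).
# Each hypothesis is a candidate meaning for the word "dax".
HYPOTHESES = {
    "triangle": lambda obj: obj["shape"] == "triangle",
    "green object": lambda obj: obj["color"] == "green",
}

def posterior(prior, example):
    """Posterior over meanings after one positive example of a 'dax'."""
    # Assumed likelihood: 1 if the hypothesis covers the example, else 0.
    unnorm = {
        h: p * (1.0 if HYPOTHESES[h](example) else 0.0)
        for h, p in prior.items()
    }
    total = sum(unnorm.values())
    return {h: v / total for h, v in unnorm.items()}

def prob_is_dax(prior, example, probe):
    """Probability that a new object (probe) is also a 'dax'."""
    post = posterior(prior, example)
    return sum(p for h, p in post.items() if HYPOTHESES[h](probe))

# Two learners with different inductive biases, expressed as priors.
shape_biased = {"triangle": 0.9, "green object": 0.1}
color_biased = {"triangle": 0.1, "green object": 0.9}

green_triangle = {"shape": "triangle", "color": "green"}
red_triangle = {"shape": "triangle", "color": "red"}

p_shape = prob_is_dax(shape_biased, green_triangle, red_triangle)
p_color = prob_is_dax(color_biased, green_triangle, red_triangle)
```

Because the green triangle is consistent with both meanings, the data alone cannot separate them. Under these assumed priors, the shape-biased learner judges a red triangle likely to be a “dax,” while the color-biased learner judges it unlikely.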
By creating different computational systems that have different inductive biases, we will test which inductive biases yield the most human-like learning, allowing us to determine what factors guide language acquisition in humans. In addition to improving our understanding of how humans learn, this research will also have implications for artificial intelligence (AI). AI plays an increasingly large role in public and private life, but current AI systems replicate harmful prejudices from their training data. Our approach provides a way to control the inductive biases of AI systems, which could be used to discourage harmful generalization patterns that current models display.
This project presents a novel computational approach for modeling inductive biases and how they interact with data, and it then uses this approach in classic settings where inductive biases are debated. Our framework is based on meta-learning, a machine learning technique in which a model is shown a variety of tasks from which it automatically finds an inductive bias that enables it to learn new tasks more easily. In our application of meta-learning, we control the space of tasks, thereby controlling the inductive bias that is imparted via meta-learning. We use this approach to create models which instantiate a range of hypothesized inductive biases so that we can analyze which biases best explain human learning behavior. We evaluate this approach in a series of case studies in language acquisition, focusing on the acquisition of question formation in English and subject-verb agreement across languages. For each case study, we analyze inductive biases ranging in specificity from a general-purpose preference for simplicity to detailed innate knowledge of the structure of language, motivated by longstanding questions about whether children need extensive innate knowledge to acquire language.
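The idea that controlling the space of tasks controls the resulting inductive bias can be sketched on a deliberately tiny problem. The abstract does not name a specific meta-learning algorithm, so the example below assumes a Reptile-style first-order update purely for brevity; the one-parameter regression model, the task distributions, and all hyperparameters are likewise hypothetical stand-ins for the neural networks and linguistic tasks the project describes.

```python
import random

def adapt(w, task_slope, rng, steps=5, lr=0.1):
    """Inner loop: a few SGD steps on one task of the form y = task_slope * x."""
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        error = w * x - task_slope * x
        w -= lr * 2.0 * error * x  # gradient of squared error w.r.t. w
    return w

def meta_learn(sample_task, seed=0, meta_steps=2000, meta_lr=0.05):
    """Outer loop (Reptile-style, an assumption): move the initialization
    toward the weights reached after adapting to each sampled task."""
    rng = random.Random(seed)
    w_init = 0.0
    for _ in range(meta_steps):
        slope = sample_task(rng)
        w_adapted = adapt(w_init, slope, rng)
        w_init += meta_lr * (w_adapted - w_init)
    return w_init

# Two hypothetical task spaces. Choosing the task space is how the
# experimenter chooses the bias that meta-learning imparts.
def positive_tasks(rng):
    return rng.gauss(3.0, 0.5)

def negative_tasks(rng):
    return rng.gauss(-2.0, 0.5)

init_pos = meta_learn(positive_tasks)
init_neg = meta_learn(negative_tasks)
```

In this sketch the learned initialization plays the role of the inductive bias: a model meta-trained on slopes near 3 starts out expecting slopes near 3, and one meta-trained on slopes near -2 starts out expecting slopes near -2, so each needs very little data to learn a new task drawn from its own task space.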
We define the inductive biases of interest using probabilistic models and then use meta-learning to instantiate those biases in neural networks. Our approach therefore combines the complementary strengths of two major approaches to computational cognitive science: the controllability of probabilistic models and the learning power of neural networks. All of our experiments involve the creation of useful resources (code, datasets, and trained models) that we will make publicly available to facilitate further research into language learning.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.