Wordbank: An Open Repository for Developmental Vocabulary Data

$502,087 | FY2015 | SBE | NSF

Stanford University, Stanford, CA

Abstract

Learning language is one of the most impressive and intriguing human accomplishments. Early language skills set the stage for later cognitive development and academic achievement. The goal of this project is to develop a powerful tool that helps researchers studying typical and atypical language development better understand young children's earliest language. This tool, called Wordbank, is a structured database of parent reports about children's vocabulary that combines tens of thousands of reports completed by parents whose children have participated in child development research. Wordbank will include data from research laboratories in dozens of countries, collected over many years and covering many of the world's languages. This database will be useful for identifying generalizable trends across languages and cultures as well as for exploring why individual children differ in their language development. Such a rich source of information will allow for novel insights that could not be discovered in smaller samples.

Wordbank will make use of the MacArthur-Bates Communicative Development Inventories (CDIs), a widely used family of parent-report instruments designed for easy and inexpensive data gathering about children's early language acquisition. Wordbank will archive CDI data across languages and labs in a relational database with an item-by-child format. Built on open-source analytic tools, the site will host in-depth exploratory visualizations and facilitate the productive reuse of data. In addition to an interactive interface for exploration, the website will allow researchers to connect directly to the underlying database. The result will be a resource that enables new discoveries about early language across a variety of disciplines.
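To make the item-by-child format concrete, the following is a minimal sketch that assumes the phrase means one stored row per vocabulary item per child. The table names, column names, and sample rows are hypothetical illustrations, not Wordbank's actual schema or data. It uses Python's built-in sqlite3 module with an in-memory database.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical item-by-child layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- One row per child who was the subject of a parent report.
        CREATE TABLE child (
            child_id   INTEGER PRIMARY KEY,
            language   TEXT NOT NULL,
            age_months INTEGER NOT NULL
        );
        -- One row per vocabulary item on a (hypothetical) CDI form.
        CREATE TABLE item (
            item_id  INTEGER PRIMARY KEY,
            language TEXT NOT NULL,
            word     TEXT NOT NULL
        );
        -- The item-by-child table: one row per (child, item) pair.
        CREATE TABLE response (
            child_id INTEGER NOT NULL REFERENCES child(child_id),
            item_id  INTEGER NOT NULL REFERENCES item(item_id),
            produces INTEGER NOT NULL CHECK (produces IN (0, 1)),
            PRIMARY KEY (child_id, item_id)
        );
        """
    )
    # Made-up sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO child VALUES (?, ?, ?)",
        [(1, "English", 18), (2, "English", 18), (3, "English", 24)],
    )
    conn.executemany(
        "INSERT INTO item VALUES (?, ?, ?)",
        [(1, "English", "dog"), (2, "English", "ball")],
    )
    conn.executemany(
        "INSERT INTO response VALUES (?, ?, ?)",
        [(1, 1, 1), (1, 2, 0), (2, 1, 0), (2, 2, 0), (3, 1, 1), (3, 2, 1)],
    )
    return conn


def proportion_producing(conn, word):
    """Return (age_months, proportion of children producing `word`) rows."""
    return conn.execute(
        """
        SELECT c.age_months, AVG(r.produces)
        FROM response AS r
        JOIN child AS c ON c.child_id = r.child_id
        JOIN item  AS i ON i.item_id  = r.item_id
        WHERE i.word = ?
        GROUP BY c.age_months
        ORDER BY c.age_months
        """,
        (word,),
    ).fetchall()


def vocabulary_sizes(conn):
    """Return (child_id, number of items produced) rows."""
    return conn.execute(
        """
        SELECT child_id, SUM(produces)
        FROM response
        GROUP BY child_id
        ORDER BY child_id
        """
    ).fetchall()
```

Under this assumed layout, both item-level questions (what share of children at a given age produce a word) and child-level questions (how large is each child's reported vocabulary) come from the same response table, which is one plausible reason to store data at this granularity.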
