Parsed and Audio-Aligned Corpus of Bilingual Russian Child Speech (BiRCh)
Brandeis University, Waltham MA
Investigators
Abstract
Language, thought, and culture are intricately connected, but the way this relationship plays out over the lifespan of an individual, especially a bilingual person, is not well understood. Bilinguals constitute a large portion of the population in the US and around the world. This project studies the language of bilingual speakers in Russian immigrant communities to gain a better understanding of fundamental properties of linguistic knowledge, language acquisition and maintenance, the nature of language variation and change, and the stability of native speaker knowledge. Studying the language of immigrants is also important because it will help build understanding of and respect for their often stigmatized linguistic practices. The large, open-access database of bilingual and monolingual Russian speech created during the project will allow education policy makers and practitioners to make appropriate decisions concerning bilingual children in American schools and to create educational resources for heritage speakers of Russian, who are an invaluable language resource for the country. Ultimately, this database can also support natural language processing applications for Russian, for example by enhancing opportunities for cross-cultural communication online and by making new and less available publications quickly accessible through summarization and machine translation.
The project will construct an open-access online database documenting the speech of two types of bilinguals: émigré adults and young bilingual children in Russian-speaking families in the US and Germany, with a control group of monolingual families with small children in Russia. This first-of-its-kind database will serve as a tool for comparing linguistic behavior across populations and over time, and for investigating correlations between grammatical, lexical, and sociolinguistic variables. It will contain audio-aligned transcripts and will be annotated for morphology (e.g., "feminine noun") and syntax (e.g., "relative clause"), which will allow researchers to study frequencies of constructions in both the parents' and children's speech. The database will enable researchers to tease apart several possible causes of the differences between the home language of bilingual children and adults and the speech of monolinguals: normal processes of language change or the influence of the majority language and culture; incomplete learning or forgetting of the home language; or universal cognitive and linguistic principles.