
CAREER: The Grammar Matrix: Computational Linguistic Typology

$474,229 · FY2007 · SBE · NSF

University of Washington, Seattle, WA

Investigators

Abstract

Linguists investigate the structure of natural languages by building models called grammars which associate strings of words with their semantic representations. As natural languages are quite complex in their structure, there are great benefits to building these models as computer software, both in managing the complex interactions of subparts of the model and in automating the testing of the model against large collections of natural language text. However, such computerized models are expensive and difficult to build, so most linguistic hypothesis testing takes place off-line. At the same time, there are great similarities among languages, suggesting that existing work on computerized models for a few well-studied languages can be leveraged to jump-start the creation of models for other languages. With this CAREER award, Dr. Emily Bender aims to develop a grammar customization system that combines a cross-linguistic core grammar with a series of 'libraries' specifying alternate ways of realizing different linguistic subsystems that vary across languages (e.g., the expression of tense and aspect, coordination, or negation). This system will allow linguists to easily customize a small working grammar for a particular language, which they can then use as a test-bed for further linguistic investigation. The process of building the system itself will involve an unprecedented exploration of computational linguistic typology, developing precise analyses of diverse linguistic phenomena across a sample of languages which show the boundaries of variation. The broader impacts of this project include the potential for quicker, more precise documentation of endangered languages and cheaper development of natural language technology (including machine translation, grammar checkers, and computer-assisted language learning programs) in under-resourced languages.
This project is cofunded by the Linguistics Program in the Behavioral & Cognitive Sciences Division of the Social, Behavioral & Economic Sciences Directorate and the Robust Intelligence Cluster in the Information & Intelligent Systems Division of the Computer & Information Science & Engineering Directorate.
