Doctoral Dissertation Research: Inferring the linguistic past of an indigenous language family of North America

$20,206 | FY2022 | SBE | NSF

University Of California-Berkeley, Berkeley CA

Investigators

Abstract

Languages evolve in ways similar to biological species. Recent linguistic work has used computational phylogenetic methods from the biological sciences to address questions about the linguistic past: Is French more closely related to Italian or to Spanish? When did the Germanic languages, such as German and English, begin to diverge from their common ancestor? The former is an example of inferring genetic relationships within a language family tree, and the latter is an example of estimating the dates when genetically related groups diverged. There is still much to learn about the use of these methods in linguistics and how much confidence can be placed in the results they produce. To this end, this study employs computational phylogenetic methods to infer the genetic relationships and divergence dates of an Indigenous language family in North America and examines how the results differ from those of previous studies. This work also attempts to understand why these differences occur and what new light the results shed on the linguistic history of the language family.

Data used in linguistic phylogenetic studies to address similar research questions have varied considerably. As a result, it is often unclear which data type to use, how the data should be coded, and how these decisions affect the results. Additionally, different types of linguistic data may exhibit different evolutionary dynamics, so the same evolutionary model may not be appropriate for all kinds of data. Therefore, this study examines how different choices of (a) data type, such as lexical and grammatical data, (b) coding scheme, and (c) evolutionary model influence the results. To ensure that the data are coded accurately and consistently, this study involves philological, archival, and field research. As some languages are no longer spoken and contain ‘missing’ data, this study also employs the same methods for inferring the linguistic past to predict and fill in missing values in the data, which can then be helpful for language revitalization efforts.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
