Five Languages of Eurasia: Field Work, Analysis and Digital Archiving
Colgate University, Hamilton NY
Investigators
Abstract
In this three-year project, funded through the NSF Documenting Endangered Languages program and the Arctic Social Sciences program, Dr. Alexander Nakhimovsky of Colgate University and a group of Russian linguists led by Dr. Alexander Kibrik of Moscow University will focus on two areas: (1) field work on five languages: Archi and Khinalugh in the Caucasus, Nganasan and Enets on the Taimyr peninsula, and Alutor on Kamchatka; and (2) development of tools and infrastructure for creating language archives. Work in the first area will be done by Russian scholars who are top experts on the languages involved but lack the technology and expertise to create language archives in accordance with "best practices." Work in the second area will be done by the PI, a PhD in linguistics who has been teaching computer science since 1985 and has taught and published extensively on the XML and Semantic Web technologies on which these "best practices" are based.
Project participants will:
- conduct extensive fieldwork to record digital audio and video materials in all languages;
- revise, digitize, and document in accordance with best practices, as well as provide wider access to, previously collected materials going back to the 1970s;
- devise a writing system for two languages, and educational materials for one of them;
- for three languages, create a phonetic database containing sound samples at different levels, from syllable to word to sentence to narrative, with videos showing articulation;
- format primary materials in accordance with best practices for digital language archives and make them available, as dynamic OLAC repositories, in consistent, archivable, interoperable, and Web-based formats;
- further develop the infrastructure and software tools for "best practices" in creating language archives, including the PI's unique software tool for annotating digital video.
This proposal will leverage the expertise and ongoing research of top-notch scholars who will contribute, beyond the boundaries of this proposal, a great deal of labor- and expertise-intensive work on glossing and describing the data. They will work to document languages that are either endangered or on the brink of extinction, in areas that are either extremely remote or in political turmoil, in communities where they already have networks of informants who know and trust them. In addition, the project will produce software tools both for conducting fieldwork and for archiving its results on the Web. Some of these will be quite general tools for collecting metadata from domain experts. When completed, the project will benefit several communities. First, some of the communities of native speakers involved in the project will receive a writing system and educational materials for children. Second, the community of Russian linguists, many of them highly trained and motivated people who have been working on the many endangered languages of the Russian Federation for years or decades, will receive a powerful impetus to adopt better archival practices. What documentation of their work exists now is mostly on paper or in proprietary formats, in Russian, with audiotapes missing or of poor quality. If their documentation standards improve, linguists and anthropologists worldwide will gain access to a vast body of properly archived linguistic knowledge. Third, the community of linguists everywhere, not only those working on endangered languages, will receive powerful new tools, some of them bundled into SIL's popular toolbox for fieldwork. Finally, the project will benefit computer scientists and technologists who are building the Semantic Web: a web of well-documented and well-indexed information that can support complex search, navigation, and inference.