EAGER: Phylo: Phylogenetic Reconstruction of Textual Histories

$75,000 | FY2010 | CSE | NSF

University of Southern California, Los Angeles, CA

Abstract

This project, supported by an EArly-concept Grant for Exploratory Research (EAGER), is developing computational models of how manuscripts of premodern texts changed over time through copying errors, intentional editing, and translation into other languages. The purpose of these models is to reconstruct the original texts and to better understand the forces that shaped them. We are building on work that applies ideas from computational evolutionary biology to this task, but the main focus of the project is to explore whether recent ideas from computational linguistics and natural language processing are better suited to modeling the evolution of natural-language texts. In particular, we are exploring the use of techniques from nonprojective dependency parsing to model the tree of relationships among manuscripts, and of statistical machine translation to model the relationship between pairs of manuscripts. The tools that result from the project will be made publicly available in order to foster cross-disciplinary research. These tools will enable scholars of ancient and medieval literature to use our models to analyze collections of manuscripts that may previously have been impractical to analyze by hand. The techniques explored will also shed light on computationally hard learning and search problems of the kind that frequently arise in natural language processing.
