CAREER: Semantics for Statistical Machine Translation

$513,158 | FY2006 | CSE | NSF

University of Rochester, Rochester, NY

Investigators

Abstract

The past few years have seen a revolution in machine translation, with the widespread adoption of statistical systems trained on large amounts of parallel bilingual text. Recent evaluations have shown that current statistically trained research technology significantly outperforms commercially available MT systems, such as those available on the web. But even state-of-the-art systems produce garbled translations more often than not, and further improvements will require major changes in the architecture of statistical systems. Our research aims to improve the quality of machine translation output by allowing statistical systems to handle deeper, semantic representations, specifically at the level of predicate-argument structure. This work builds on recent successes in statistical approaches to shallow language understanding and in tree-based algorithms for machine translation that use syntactic parses of the source and target sentences. Over the course of the project we aim to: first, develop robust semantic parsing systems capable of generalizing to new domains, and apply them to large bilingual corpora; second, develop probabilistic models of translation that use the resulting level of representation and can be practically trained; and third, integrate language understanding and translation to allow efficient search for the best overall translation of new sentences.
