ITR/PE: AVENUE: Adaptable Voice Translation for Minority Languages
Carnegie Mellon University, Pittsburgh PA
Investigators
Abstract
Our primary research goal is to develop a prototype voice-enabled translating communicator that delivers information services across the linguistic divide for minority languages. It will allow remote, linguistically diverse users to communicate directly with Internet content and databases and, more importantly, with others who speak a different language from their own. The latter will enable information, education, and, for example, health services to reach remote minority-language communities.

Achieving this goal requires major advances in machine learning for translation and in cross-language speech-recognition adaptability to wider language phenomena. Traditional transfer-rule-based MT requires up to a person-century to build and perfect a new language pair. Statistical and Example-Based MT replace human coding effort with vast amounts of bilingual training data, which are virtually unobtainable for most minority languages. Without a radical advance that improves development time by more than an order of magnitude, the only commercially justifiable MT applications involve the major European languages, Japanese, Chinese, Korean, Arabic, and perhaps a couple more relatively popular languages. The vast majority of human languages are currently relegated to the proverbial MT dust heap.

We propose new MT approaches based on extended and new machine learning methods. The first approach consists of statistical MT methods that learn from orders of magnitude less training data and that can more effectively incorporate prior linguistic information (including dictionaries, word classes, and known linguistic rule classes or constraints) by using the joint source-channel modeling approach combined with exponential (maximum entropy) models. The second approach is a new method for acquiring high-quality MT transfer rules from native informants, which decreases dependence on human experts and reduces development time.
Semantically-conditioned transfer rules are generalized via a new locally-constrained Seeded Version-Space method based on a controlled bilingual corpus and interactive tools to elicit information from native informants. The third method builds general phone models across multiple language families for speech recognition and adapts the recognizer to new languages with minimal new-language training data. All of these methods are based on new and existing machine learning algorithms that combine prior knowledge with limited amounts of new data in order to converge quickly on working machine translation, speech recognition, and speech synthesis systems.

The primary societal impact will be a significant contribution to the global democratization of information, a process that requires bridging current linguistic barriers, especially for low-density or economically-disadvantaged languages. Additionally, the preservation and teaching of endangered languages will be directly enabled by the new linguistic and acoustic knowledge coupled with existing tutorial software. If successful, Avenue (Adaptable Voice-Enabled Natural-translator for Universal Empowerment) will be the prototype of an MT system that will empower world-wide access to multilingual information.