
US-Egypt Cooperative Research: Integrating Statistical Machine Translation from Arabic to English with Syntactic and Semantic Analysis

$25,000 · FY2002 · O/D · NSF

University of Southern California, Los Angeles, CA

Abstract

0210165 Knight

Description: This award supports a collaborative project between Dr. Kevin Knight, Information Sciences Institute (ISI), University of Southern California, Los Angeles, California, and Dr. Ahmed Rafea, Computer Science Department, The American University in Cairo (AUC), Cairo, Egypt. They plan to explore a new method for automatic translation between Arabic and English. The method, known as statistical machine translation, exploits both fast computers and the existence of large collections of human-translated documents. The investigators plan to develop new models based on bilingual texts that they will manually annotate. These annotations will show how the translation process should move from Arabic into syntactic/semantic structures and from there into English.

In the first stage, a substantial collection of United Nations documents translated into Arabic will be manually annotated with morphological, syntactic, and semantic information by graduate students at the AUC. In the second stage, the annotated corpus will be used to train morphological analyzers and parsers of Arabic, in order to automate annotation in the second year of the project, and to study the annotations of parallel Arabic and English sentences for the development of syntactic and semantic translation models. In the third and fourth stages, existing learning algorithms will be investigated for their suitability to integrate the developed models with statistical machine translation systems developed at ISI.

Scope: Reliable automatic translation between Arabic and English may have a major impact on international commerce, technology, and science, but this requires that translation quality be improved significantly. With the proposed method, computers may gather vast amounts of translation knowledge automatically from text and apply that knowledge to translate new documents.
This research will require significant expertise in computer science, Arabic and English linguistics, and statistical inference, all of which are available in the two collaborating groups at USC and AUC. Dr. Knight is well known for his research in statistical machine translation, and he will supervise the research performed by graduate students at ISI and AUC. Dr. Rafea has excellent credentials in Arabic natural language processing and in conducting collaborative research with U.S. universities. Addressing weaknesses in current translation models will make higher translation quality possible and will make practical use of machine translation more widespread. The research will involve U.S. and Egyptian graduate students and will promote collaboration between researchers in the U.S. and Egypt. This project is being supported under the US-Egypt Joint Fund Program, which provides grants to scientists and engineers in both countries to carry out such cooperative activities.
