
XMELLT: Cross-lingual Multi-word Expression Lexicons for Language Technology

$99,322 | FY2000 | CSE | NSF

Vassar College, Poughkeepsie NY

Investigators

Abstract

Information technology is profoundly reshaping global society and the economy, and this creates a need for means of global access to the huge volumes of disorganized, relatively unstructured material now available to all members of society for business and pleasure, teaching and scholarship. This project aims to define the dimensions of a core international infrastructure that can support the creation of a multi-lingual multi-word expression lexicon incorporating both morpho-syntactic and semantic information. Such a lexicon will provide a base for building pivotal natural language applications for managing, and providing universal access to, the vast quantity of information that becomes available each day via the World Wide Web. In particular, the PI's aims are: (1) to establish uniform standards for describing multi-word lexical entries at the levels of syntax, morpho-syntax, and lexical semantics; (2) to identify realistic objectives for the representation of phrasal expressions in a multi-lingual lexicon, in terms of size, scope, existing usable input, additional input and links to other lexical resources, and information types still to be acquired; (3) to determine the type and dimensions of the information that will best serve the needs of critical NLP applications; (4) to specify an overall architecture for a joint software and "lingware" development project; (5) to specify the outline of a collaborative project to acquire and represent multi-word lexical entries for multiple languages; (6) to explore the feasibility and dimensions of the eventual project by creating a small number of multi-word entries for support verbs, including their syntax and morpho-syntax; and (7) to explore the possibilities of recognizing and acquiring a repertory of multi-word lexical units from corpora by means of partial parsing, statistical methods, and related techniques.
This research will lay the groundwork for developing a multi-lingual lexicon of multi-word expressions, an essential resource for the future IT environment. In this way, the work will contribute directly to the creation of a universally accessible, multi-lingual and multi-cultural information infrastructure for individuals and organizations across the globe.
