CAREER: Cross-Document Cross-Lingual Event Extraction and Tracking

$543,384 · FY2010 · CSE · NSF

CUNY Queens College, Flushing, NY

Investigators

Abstract

The goal of this research project is to advance the Information Extraction (IE) paradigm beyond "slot filling" and to achieve more accurate, salient, complete, concise and coherent extraction results by exploiting dynamic background knowledge and cross-document cross-lingual event ranking and tracking. The approach consists of cross-document inference; prediction of and reasoning about unknown implicit event times; cross-document entity coreference resolution with global contexts; centroid entity detection; event attribute extraction and graph-based clustering algorithms for redundancy and contradiction detection; automatic new event clustering and active learning; abstractive summary generation based on extraction results; and name translation with comparable corpora and cross-lingual co-training. The experimental research is integrated with educational activities, including project-related curriculum development. The project involves PhD students as well as undergraduate students, engages non-Computer Science undergraduates in utility evaluation and corpus annotation, and attracts elementary school and high school students through tutorials, regular research seminars and an extensive summer workshop. The results of this project will also benefit E-Science and E-Learning by extracting and tracking related knowledge from scientific literature and from learning materials used in elementary schools and high schools. Project results, including open source software, task definition guidelines, annotated corpora and scoring metrics, will be disseminated via the project Web site (http://nlp.cs.qc.cuny.edu/blendeet.html).
