
CRII: RI: Learning a Timely Semantic Resource from Social Media Data

$61,009 | FY2020 | CSE | NSF

Georgia Tech Research Corporation, Atlanta GA

Abstract

One key challenge in text mining and natural language processing research is that a single meaning can be expressed in many different ways, i.e., through paraphrases. There has been steady progress towards large paraphrase resources, and a significant increase in their applications: from information retrieval, information extraction, and natural language generation to IBM's Watson, Google's Knowledge Graph, and many more. This research aims to create better paraphrase acquisition techniques and larger-scale semantic resources, which could be of great use in various natural language processing tasks and in social media data analytics for social science, national security, and other related fields. One example of a potential application is text simplification, which automatically rephrases complex texts into simpler language for children or people with reading disabilities. The technical innovation of this study focuses on joint modeling of word- and phrase-level alignments between sentence pairs, to address the challenges of extracting semantic knowledge from informal data sources such as social media, which exist in very large quantities, rather than only from formal sources such as newswire, as in previous work. The model design extends multiple instance learning via two methods, a graphical model and a neural network; it flexibly permits the exploration of different assumptions and models the importance of words or phrases. The modeling advancements can be generalized to other natural language understanding tasks, which require analyzing sentences based on word-level composition or word meaning in a given context, and to natural language generation tasks that benefit from learning which words and phrases to remove or rephrase. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
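
To make the multiple instance learning idea in the abstract concrete, here is a minimal sketch. It is not the project's model, and the abstract does not describe one in this detail. It assumes a sentence pair is treated as a "bag" of word-pair instances and that the bag is a paraphrase if at least one instance is, combined with a noisy-OR. The names `word_pair_score` and `bag_probability` and the character-bigram scorer are hypothetical stand-ins for a learned word-level model.

```python
from itertools import product
from math import prod


def word_pair_score(a: str, b: str) -> float:
    """Hypothetical instance scorer (an assumption, not from the award):
    Jaccard overlap of character bigrams, a value in [0, 1]."""
    def bigrams(word: str) -> set:
        word = word.lower()
        # Fall back to the whole word when it is too short to have bigrams.
        return {word[i:i + 2] for i in range(len(word) - 1)} or {word}

    x, y = bigrams(a), bigrams(b)
    return len(x & y) / len(x | y)


def bag_probability(sent1: str, sent2: str) -> float:
    """Noisy-OR over all word pairs: the sentence pair (the bag) is positive
    if at least one word pair (an instance) is positive."""
    instances = [
        word_pair_score(a, b)
        for a, b in product(sent1.split(), sent2.split())
    ]
    return 1.0 - prod(1.0 - p for p in instances)


if __name__ == "__main__":
    print(bag_probability("he won the game", "he wins the match"))
    print(bag_probability("", "nothing to align"))
```

A surface-overlap scorer like this one saturates quickly on real sentences. The sketch only illustrates how instance-level scores aggregate into a sentence-pair decision, which is the part of the design that a graphical model or neural network would learn instead.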
