RI: Small: Fast, Scalable Joint Inference for NLP using Markov Logic

$360,348 · FY2015 · CSE · NSF

University Of Texas At Dallas, Richardson TX

Abstract

Many fundamental tasks in natural language processing (NLP), such as coreference resolution and event extraction, involve complex output constraints. Markov Logic Networks (MLNs), a joint inference framework that combines logical and probabilistic representations, allow such constraints to be specified manually and compactly, which makes it easy to incorporate background knowledge into NLP systems and improve their performance. While theoretically appealing, MLNs have been relatively underused in NLP applications. Owing to scalability issues, researchers have largely restricted themselves to simple MLNs that either make simplifying, sometimes unreasonable, assumptions or ignore complex output constraints. This project seeks to bring transformative changes to the way joint inference is applied in NLP. The idea is to develop fast, scalable learning and inference techniques for MLNs so that rich models (i.e., models with high-dimensional features and/or complex output constraints) can be efficiently trained and applied to large data sets. A key component of the project is the formulation and evaluation of rich MLN-based models for important and complex NLP tasks such as coreference resolution. These rich models, especially when trained on large data sets, can potentially yield breakthrough results in NLP, which in turn can have profound societal impact. For example, improvements in coreference technologies stand to benefit essentially all NLP applications the general public relies on every day, such as search, information extraction, and question answering.

Successful application of rich MLN-based models to complex NLP tasks requires the development of fast, scalable learning and inference techniques. To scale up weight learning in MLNs, this project develops approaches that leverage advanced algorithms from the constraint satisfaction literature for fast, approximate solution counting. To scale up probabilistic inference, the project employs lifted inference algorithms that reduce the domain size of variables in MLNs by exploiting exact as well as approximate symmetries (e.g., paraphrases). The core NLP tasks it focuses on, such as coreference resolution and temporal relation extraction, are sufficiently complex that they provide convincing testbeds for evaluating the scalability of these learning and inference techniques. Equally important, as approximate language is a phenomenon that occurs across NLP tasks, these advances are likely to impact a wide swath of tasks in NLP.
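To make the scalability concern concrete, the following is a minimal sketch of a toy MLN for coreference, not the project's own model or code. It assumes three hypothetical mentions, illustrative hand-set weights, and a single soft transitivity formula. It uses the standard MLN definition, in which a world's probability is proportional to the exponential of the weighted count of satisfied formula groundings. Marginals are computed by brute-force enumeration, which is exactly the step that stops scaling as the number of mentions grows.

```python
import itertools
import math

# Hypothetical toy data: three mentions and one predicate Coref(i, j).
MENTIONS = ["Obama", "the president", "he"]
PAIRS = list(itertools.combinations(range(len(MENTIONS)), 2))

# Illustrative weights chosen by hand (assumption: nothing here is learned).
PAIR_WEIGHTS = {(0, 1): 1.5, (0, 2): -0.5, (1, 2): 1.0}
W_TRANSITIVITY = 2.0


def coref(world, i, j):
    """Look up the symmetric Coref atom for mentions i and j."""
    return world[(min(i, j), max(i, j))]


def transitivity_count(world):
    """Count satisfied groundings of Coref(a,b) ^ Coref(b,c) => Coref(a,c)."""
    count = 0
    for a, b, c in itertools.permutations(range(len(MENTIONS)), 3):
        premise = coref(world, a, b) and coref(world, b, c)
        if (not premise) or coref(world, a, c):
            count += 1
    return count


def log_score(world):
    """Unnormalized log-probability: sum of weight * satisfied-grounding count."""
    unary = sum(w for pair, w in PAIR_WEIGHTS.items() if world[pair])
    return unary + W_TRANSITIVITY * transitivity_count(world)


def all_worlds():
    """Enumerate every truth assignment to the Coref atoms (2^|PAIRS| worlds)."""
    for values in itertools.product([False, True], repeat=len(PAIRS)):
        yield dict(zip(PAIRS, values))


def marginal(pair):
    """Exact marginal P(Coref(pair) = True) by summing over all worlds."""
    z = 0.0
    mass = 0.0
    for world in all_worlds():
        score = math.exp(log_score(world))
        z += score
        if world[pair]:
            mass += score
    return mass / z


if __name__ == "__main__":
    for pair in PAIRS:
        print(MENTIONS[pair[0]], "/", MENTIONS[pair[1]], round(marginal(pair), 3))
```

With n mentions there are n(n-1)/2 Coref atoms and therefore 2^(n(n-1)/2) worlds, so this enumeration is only feasible for toy inputs. That growth is what motivates approximate counting for weight learning and lifted inference for marginals.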
