RI: Small: Acquiring Domain Knowledge from Text through Cooperative Bootstrapping
University of Utah, Salt Lake City, UT
Investigators
Abstract
Some of the most pressing needs for natural language processing (NLP) technology come from specialized domains, such as clinical medicine and molecular biology, where broad-coverage solutions are not sufficient. This project develops bootstrapped learning techniques to rapidly create domain-specific semantic analyzers and to automatically harvest domain knowledge from unstructured text. It establishes a new cooperative bootstrapping paradigm that learns semantic analyzers for different tasks simultaneously by allowing the classifiers for those tasks to learn from each other. These analyzers then populate a domain event graph with semantic information extracted from a domain-specific text collection, and new knowledge harvesting algorithms acquire domain-specific facts and inference rules from the graph. The project explores the domain of veterinary medicine, using message board posts by veterinarians to acquire real-world knowledge for the purposes of animal health surveillance.

This work will advance the state of the art in natural language technology by developing a new bootstrapping framework to rapidly create semantic analysis tools for specialized domains. This technology will impact many NLP applications, including information extraction, question answering, and summarization. The knowledge harvesting tools will be made publicly available to allow for direct impact across the many disciplines that need to extract knowledge from unstructured text collections. This project will also benefit society by creating new tools for animal health surveillance, which could provide early warning signs of zoonotic disease outbreaks (such as bird flu and mad cow disease), exposures to toxic substances, and contamination in the food chain.