III-COR: Searching Archives of Community Knowledge

$554,132 · FY2007 · CSE · NSF

University Of Massachusetts Amherst, Amherst MA

Investigators

W. Bruce Croft (University of Massachusetts Amherst)

Abstract

The problem of recognizing or finding questions has received little attention, yet it is important and conceptually different from typical document search. Text transformation models learned from Q&A archives have the potential to considerably advance our knowledge of how models that have been very successful in machine translation can be applied to improve search effectiveness. This project will use two large archives of questions and answers to develop, test, and compare algorithms for more effective retrieval of questions that are semantically similar to the question posed by a current user, with the aim of retrieving answers that have a high probability of relevance. One aspect of addressing this challenge will be to apply text transformation techniques developed for machine translation to the Q&A archives in order to learn word-substitution probabilities for identifying semantically similar queries. The techniques will be evaluated for both English and Korean, since one of the Q&A archives is in Korean. Over a dozen alternative approaches to developing similarity measures will be tried and compared using a variety of performance metrics, primarily metrics that focus on the precision and relevance of results returned at the top of a ranked list. Successful approaches to retrieving appropriate answers from archives of questions and answers will benefit many application areas.
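As an illustration of the word-substitution idea in the abstract, the following is a minimal sketch of ranking archived questions with a translation-style language model and scoring the ranking with precision at k. It is not the project's actual method: the `TRANSLATION` table, its probability values, the smoothing weight, and all function names are hypothetical, and the probabilities are left unnormalized for brevity. It also assumes a non-empty archive of non-empty, pre-tokenized questions.

```python
import math
from collections import Counter

# Hypothetical word-substitution probabilities P(query_word | archive_word).
# In a real system these would be learned from Q&A pairs; the values here
# are invented purely for illustration.
TRANSLATION = {
    ("fix", "repair"): 0.3,
    ("laptop", "notebook"): 0.4,
}


def translation_prob(query_word, archive_word):
    """Return the assumed substitution probability (1.0 for identical words)."""
    if query_word == archive_word:
        return 1.0
    return TRANSLATION.get((query_word, archive_word), 0.0)


def score(query, question, collection_counts, smoothing=0.2):
    """Log-likelihood of the query given an archived question.

    Each query word is explained either by substituting for a word in the
    archived question or by a smoothed background collection model.
    """
    counts = Counter(question)
    length = len(question)
    total = sum(collection_counts.values())
    vocab = len(collection_counts)
    log_p = 0.0
    for w in query:
        translated = sum(
            translation_prob(w, t) * c / length for t, c in counts.items()
        )
        # Add-one smoothing keeps the background probability above zero.
        background = (collection_counts.get(w, 0) + 1) / (total + vocab)
        log_p += math.log((1 - smoothing) * translated + smoothing * background)
    return log_p


def rank(query, archive):
    """Return archived question ids sorted from best to worst match."""
    collection_counts = Counter(w for words in archive.values() for w in words)
    return sorted(
        archive,
        key=lambda qid: score(query, archive[qid], collection_counts),
        reverse=True,
    )


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked ids that are judged relevant."""
    return sum(1 for qid in ranked_ids[:k] if qid in relevant_ids) / k
```

Under these assumptions, a query such as `["fix", "laptop", "screen"]` can match an archived question worded as `["repair", "notebook", "screen"]` even though only one word is shared, which is the kind of semantic similarity plain word overlap would miss.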
