Statistical Phrase Extraction Techniques In Databases

$0 | Z01 | FY2001 | LM | NIH

National Library Of Medicine

Linked publications & trials

Paper 18834487 Paper 18817555 Paper 18080004 Paper 16867190 Paper 16843731 Paper 16779069 Paper 15556479 Paper 15130538 Paper 15073016 Paper 12798042 Paper 11079836 Paper 10984469

Abstract

The ability to locate important phrases in natural language text is useful for indexing and for placing hyperlinks in text; in either case the goal is to improve access to the textual material. In the past, the most common method for locating phrases has been part-of-speech tagging. We have developed a new approach that uses scoring algorithms to rank phrases by how useful they are likely to be. A number of different scoring methods have been developed and tested. These are being combined with stemming and with methods for finding inflectional variants of phrases that are synonymous for retrieval purposes. The UMLS is also being used to find synonymous phrases for indexing. These methods are being applied to find useful phrases in NCBI's electronic textbook project, which is currently online but still under development.
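
The abstract does not specify which scoring algorithms the project uses, so the following is only a minimal sketch of the general idea of ranking candidate phrases statistically rather than with a part-of-speech tagger. It assumes pointwise mutual information (PMI) over adjacent word pairs as the score, which is a common stand-in and not necessarily the project's method. The helper `crude_stem` is a hypothetical, deliberately simplistic normalizer that only strips a trailing plural "s"; it stands in for the stemming and inflectional-variant handling mentioned above. UMLS synonym lookup is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(word):
    """Hypothetical normalizer: strip a trailing plural 's' from longer words.

    This is an illustrative assumption, not a real stemming algorithm.
    """
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def rank_bigrams(text, min_count=2):
    """Rank two-word phrases by PMI (an assumed scoring function).

    Inflectional variants are merged by stemming before counting.
    Sentence boundaries are ignored for simplicity, and the token count
    is used as the normalizer for both unigram and bigram probabilities.
    """
    tokens = [crude_stem(t) for t in tokenize(text)]
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    total = len(tokens)

    scored = []
    for (first, second), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        pmi = math.log((count * total) / (unigrams[first] * unigrams[second]))
        scored.append((f"{first} {second}", pmi))

    # Highest score first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


if __name__ == "__main__":
    sample = (
        "Gene expression changed. Gene expressions differ. "
        "The cell is the unit and the cell is small and the result is clear."
    )
    for phrase, score in rank_bigrams(sample):
        print(f"{score:.3f}  {phrase}")
```

In this sketch, "expression" and "expressions" are counted as the same word, so the phrase "gene expression" reaches the frequency threshold only because its variants are merged. A realistic system would use a proper stemmer, a stopword filter, and a scoring function validated against the indexing task.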
