RI: Small: Coordinating Language Modeling, Computer Vision, and Machine Learning for Dramatic Advances in Optical Character Recognition

$487,395 | FY2009 | CSE | NSF

University of Massachusetts Amherst, Amherst MA

Investigators

Erik Learned-Miller (contact), Andrew K McCallum

Abstract

The goal of this research is to develop new methods for improving the performance of optical character recognition (OCR) systems. In particular, the PI investigates "iterative contextual modeling", an approach to OCR in which high-confidence recognitions of the easier portions of a document are used to build document-specific models. These models can describe appearance: for example, a sample of correctly recognized words can be used to build a model of the font used in a particular document. The models can also capture language and vocabulary information. For example, after a portion of the words in a document has been recognized, the general topic of the document may be detected, at which point the distribution over likely words in the document can be updated. The ability to adapt character appearance distributions and language statistics to the document at hand is expected to produce significant increases in the quality of OCR results.
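
The language side of this idea can be illustrated with a toy sketch. This is not the project's method, only a minimal two-pass example under stated assumptions: each word position comes with a list of hypothetical (word, confidence) candidates, the function name rerank_with_document_prior and the 0.9 threshold are invented for illustration, and the document-specific model is a simple smoothed word-frequency prior built from the high-confidence recognitions.

```python
from collections import Counter


def rerank_with_document_prior(recognitions, threshold=0.9, smoothing=1.0):
    """Re-rank OCR candidates using a prior built from confident words.

    recognitions: list of candidate lists, one per word position; each
    candidate is a (word, confidence) pair. Hypothetical input format.
    """
    # Pass 1: count words whose best candidate clears the threshold.
    trusted = Counter()
    for candidates in recognitions:
        word, confidence = max(candidates, key=lambda c: c[1])
        if confidence >= threshold:
            trusted[word] += 1

    # Document-specific prior with add-one style smoothing over all
    # candidate words seen in this document.
    vocabulary = {word for candidates in recognitions for word, _ in candidates}
    total = sum(trusted.values())

    def prior(word):
        return (trusted[word] + smoothing) / (total + smoothing * len(vocabulary))

    # Pass 2: rescore every position by confidence times document prior.
    return [
        max(candidates, key=lambda c: c[1] * prior(c[0]))[0]
        for candidates in recognitions
    ]


sample = [
    [("font", 0.95), ("fort", 0.05)],
    [("font", 0.97), ("foot", 0.03)],
    [("fort", 0.55), ("font", 0.45)],  # ambiguous position
]
print(rerank_with_document_prior(sample))
```

In this sample the third position would be read as "fort" on recognizer confidence alone; once the two confident readings of "font" raise that word's prior, the rescoring pass prefers "font" there as well. A fuller treatment in the spirit of the abstract would iterate these passes and adapt appearance models too, which this sketch does not attempt.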
