Camera-based Text Recognition from Complex Backgrounds for the Blind or Visually

$190,000R21FY2010EYNIH

City College Of New York, New York NY

Investigators

Linked publications & trials

Paper 25531008 Paper 25285325 Paper 25285042 Paper 23914345 Paper 23914344 Paper 23894729 Paper 23710105 Paper 23630409 Paper 23316111 Paper 22661884 Paper 22614647 Paper 22523465 Paper 21411405

Abstract

There are more than 10 million blind and visually impaired people living in America today. Recent technology developments in computer vision, digital cameras, and portable computers make it possible to assist these individuals by developing camera-based products that combine computer vision technology with other existing products. Although a number of reading assistants have been designed specifically for people who are blind or visually impaired, reading text from complex backgrounds or non-flat surfaces is very challenging and has not yet been successfully addressed. Many everyday tasks involve these challenging conditions, such as reading instructions on vending machines, titles of books aligned on a shelf, instructions on medicine bottles or labels on soup cans. This proposal focuses on the development of new computer vision algorithms to recognize text from complex backgrounds: 1) from backgrounds with multiple different colors (e.g .. the titles of books lined up on a shelf) and 2) from non-flat surfaces (e.g .. labels on medicine bottles or soup cans). The newly developed computer vision techniques will be integrated with off-the-shelf optical character recognition (OCR) and speech-synthesis software products. Visual information will be captured via a head-mounted camera (on sunglasses or hat) and analyzed by a portable computer (PDA or cell phone), while the speech display will be outputted via mini speakers, earphones, or Bluetooth device. A practical reading system prototype will be produced to read text from complex backgrounds and non-flat surfaces. The system will be cost-effective since it requires only a head mounted camera (<US$100 for 1M resolution), a wearable computer (<US$300), and two mini-speakers or earphones. The price of "ReadIRlS" [74] OCR software is under $150 and the "TextAloud" speech synthesis software is about $30 [75]. This project will be executed over two years at the City College of New York (CCNY) and Lighthouse International, New York. CCNY, located in the Harlem neighborhood of New York City, is designated as both a Minority Institution and a Hispanic-serving Institution (37% Hispanic and 27% African American). Lighthouse International is a leading non-profit organization dedicated to preserving vision and to providing critically needed vision and rehabilitation services to help people of all ages overcome the challenges of vision loss. During the two years, we will 1) develop new algorithms to recognize text from backgrounds with multiple different colors;2) develop new algorithms to recognize text from non-flat surfaces;and 3) develop a cost-effective prototype reading system for blind users by integrating with off-the-shelf optical character recognition (OCR) and speech-synthesis software products. The effectiveness of the prototype and algorithms will be evaluated by people with normal vision and people with vision impairment. A database of text on complex backgrounds (multiple colors and non-flat surfaces) will be created for algorithm and system evaluation. The database will be made available to research communities in the areas of computer vision and vision rehabilitation science. In summary, this effort will provide a research-based foundation to inform the design of next generation reading assistants for blind persons, as well as produce a practical prototype to help the blind user read text from complex backgrounds in real-world environments.

View original record on NIH RePORTER →