Enabling access to printed text for blind people via assisted mobile OCR
University Of California Santa Cruz, Santa Cruz CA
Investigators
Linked publications & trials
Abstract
DESCRIPTION (provided by applicant): This application proposes new technology development and user studies aiming to facilitate the use of mobile Optical Character Recognition (OCR) for blind people. Mobile OCR systems, implemented as smartphones apps, have recently appeared on the market. This technology unleashes the power of modern computer vision algorithms to enable a blind person to hear (via synthetic speech) the content of printed text imaged by the smartphone's camera. Unlike traditional OCR, that requires scanning of a document with a flatbed scanner, mobile OCR apps enable access to text anywhere, anytime. Using their own smartphones, blind people can read store receipts, menus, flyers, business cards, utility bills, and many other printed documents of the type normally encountered in everyday life. Unfortunately, current mobile OCR systems suffer from a chicken-and-egg problem, which limits their usability. They require the user to take a well-framed snapshot of the document to be scanned, with the full text in view, and at a close enough distance that each character can be well resolved and thus readable by the machine. However, taking a good picture of a document is difficult without sight, and thus without the ability to look at the scene being imaged by the camera through the smartphone's screen. Anecdotal evidence, supported by results of preliminary studies conducted by the principal investigator's group, confirms that acquisition of an OCR-readable image of a document can indeed by very challenging for some blind users. We plan to address this problem by developing and testing a new technique of assisted mobile OCR. As the user aims the camera at the document, the system analyzes in real time the stream of images acquired by the camera, and determines how the camera position and orientation should be adjusted so that an OCR-readable image of the document can be acquired. This information is conveyed to the user via a specially designed acoustic signal. This acoustic feedback allows users to quickly adjust and reorient the camera or the document, resulting in reduced access time and in more satisfactory user experience. Multiple user studies with blind participants are planned with the purpose of selecting an appropriate acoustic interface and of evaluating the effectiveness of the proposed assisted mobile OCR modality.
View original record on NIH RePORTER →