ITR: The Vocal Joystick: Voice-based Assistive Technology for Individuals with Motor Impairments

$1,240,000 | FY2003 | CSE | NSF

University of Washington, Seattle, WA

Investigators

Jeffrey A Bilmes (contact), Howard J Chizeck, Katrin Kirchhoff, Patricia A Dowden, Richard A Wright

Abstract

The goal of this project is to develop a novel device, which the PIs call the Vocal Joystick (VJ), that will enable individuals with motor impairments to use vocal parameters to control objects on a computer screen (buttons, sliders, etc.) and ultimately also electro-mechanical instruments such as robotic arms and wireless home automation devices. Standard spoken language can be quite inefficient for such continuous control tasks, and is often recognized poorly by automatic speech recognizers. Existing assistive devices, such as sip-and-puff switches, have extremely low communication bandwidth and can therefore be used only for limited tasks. The Vocal Joystick, by contrast, will allow users to exploit a large and varied set of vocalizations whose selection will be optimized with respect to reliable automatic recognizability, communication bandwidth, learnability, and ease of use. This set may include regular speech sounds, such as vowels and consonants, but the primary focus will be on the variation of individual acoustic-phonetic parameters such as pitch, loudness, vowel quality, and voice quality. The PIs will select the basic VJ sound set and will develop high-accuracy, low-latency acoustic processing techniques for the members of the set. They will design customizable libraries and application programming interfaces that facilitate the application of VJ technology to a variety of control tasks. Finally, they will test the new technology on the target population group, i.e., users with motor impairments. This research will significantly advance our understanding of human interface technology in general, and speech-based technology in particular.
Eschewing the common view that speech as used in everyday human-human communication is also the best way to interact with computers or physical devices, the PIs claim that exploiting the full range of possible vocalizations is not only a necessary extension to standard speech recognition but is, in certain contexts, superior to it. This view entails a wealth of open research issues, since non-speech vocalizations have traditionally been regarded as "garbage", an impediment rather than an aid to speech-based human-computer communication. Particularly challenging tasks are the design of novel signal processing and recognition algorithms that operate without relying on the contextual information found in natural speech, the development of vocal widgets (an extension to the standard concept of user interface widgets), and the exploration of the human ability to use multiple continuous and discrete control signals simultaneously.

Broader Impacts: This project will have a significant impact in a variety of ways. The research is itself an outreach activity, in that its goal is to develop a new type of human-computer interface for users with motor impairments; the resulting technology will allow these users to lead more independent and productive lives. Individuals with motor impairments will be included in this project at all stages, not only as participants in usability tests but also as undergraduate research team members. School-age children will be included in the early stages of the VJ design. All software resulting from this project will be documented and distributed via the Internet, so that individuals can use it to configure their own VJ interfaces.
