RI: Small: Robust Automatic Speech Recognition in Highly Reverberant Environments

$450,000 | FY2009 | CSE | NSF

Carnegie Mellon University, Pittsburgh PA

Investigators

Abstract

Speech processing systems, including automatic speech recognition and speaker identification, are the key enabling technologies that permit natural interaction between humans and intelligent machines such as humanoid robots, automated information providers, and similar devices. For example, it is now commonplace to encounter speech-based intelligent agents that handle at least the initial part of a query in many types of call center applications. While we have made great progress over the past two decades in overcoming the effects of additive noise in many practical environments, the failure to develop techniques that overcome the effects of reverberation in homes, classrooms, and public spaces is the major reason why automated speech technology remains unavailable for general use in these venues. Reverberation remains one of the most difficult unsolved problems in speech recognition in open acoustical environments. This project develops novel approaches that combat the effects of reverberation from two complementary perspectives: contemporary knowledge of human auditory processing, and state-of-the-art statistical source separation techniques that build on methods from image and music processing. The synergistic development of these approaches is expected to provide substantially improved speech recognition and speaker identification accuracy in reverberant acoustical environments, along with a principled structure that enables us to understand on a much deeper level why the solutions to these problems are effective. This work is expected to have an enormous impact in extending the applicability of automatic recognition of natural and casual speech to highly reverberant environments.
