CAREER: Optimizing Human Speech Perception in Noisy Environments with User-Guided Machine Learning

$338,850 | FY2020 | CSE | NSF

Indiana University, Bloomington IN

Abstract

Unwanted background noise often hinders device-mediated communication, both during the nearly 20 billion video conference calls held each year and for millions of hearing aid users. Approaches have been developed to remove unwanted noise, but they do not perform well in many real environments. Consequently, noise-removal approaches often provide low-quality, unintelligible listening experiences, which leave users dissatisfied and frustrated. This Faculty Early Career Development project will develop noise-reduction and assessment approaches that address these issues, resulting in improved listening experiences for users. Individuals and companies that regularly use digital means (e.g., voice conferencing and hearing aids) for person-to-person communication will be major beneficiaries of this work. The data and algorithms that result from this research will be made available to benefit scientists and researchers from diverse and interdisciplinary fields. Additionally, educational activities based on this research will be integrated into various efforts to increase the number of underrepresented participants in these research areas.

The main objective of this project is to develop user-guided machine-learning algorithms that result in improved listening experiences in real-world noisy environments. In environments that contain many competing talkers, noise-reduction systems inadvertently remove desired speech signals or retain undesired ones. The proposed research activities will address this by (1) developing multi-modal computational approaches that identify the speech signal that a specific user wants to hear. Researchers generally use computational assessment metrics to assess performance, but these metrics do not always correlate with individual user sentiment, so investigators obtain inaccurate assessment results. This project will (2) develop an effective interface for capturing and predicting short-time user assessment of quality and intelligibility. Simulated and real-world speech data differ in terms of speaker, noise, and environmental characteristics, but current noise-reduction approaches are incapable of adapting to these differences on the fly. This is a major shortcoming, as deployed noise-reduction systems will encounter unknown speakers and noises. The investigator will (3) develop a novel class of user-guided machine-learning algorithms that utilize true and predicted user assessment in near-real time for system optimization. Successfully completing these tasks will help better understand speech perception and increase the usability of noise-reduction systems.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
