Acoustic correlates of phonetic perception

$461,697 | R01 | FY2003 | DC | NIH

Western Michigan University, Kalamazoo MI

Investigators

Linked publications & trials

Paper 25480760, Paper 21682420, Paper 19525544, Paper 15938059, Paper 15478445, Paper 12597197, Paper 12199395, Paper 11248979, Paper 11144593

Abstract

DESCRIPTION (provided by applicant): Numerous practical and theoretical problems could be addressed if we had a deeper understanding of the auditory and perceptual mechanisms underlying phonetic recognition. Practical applications of this knowledge include improvements to cochlear implant signal processors, the improvement of speech synthesis devices, the development of robust speech recognition algorithms, and the development of training devices for hearing-impaired speakers. The proposed experiments fall into three major categories.

An extensive series of experiments is designed to test the validity of a new model of vowel perception that was developed during the previous grant period. The model assumes that vowel identity is recognized by a template-matching process in which narrowband input spectra are compared with a set of smoothed spectral-shape templates that are learned through ordinary exposure to speech. An evaluation conducted during the previous grant period showed that the model is capable of recognizing vowels from a large, multitalker database with accuracy approaching that of human listeners. We would like to extend this line of work to address issues such as: (1) modeling the integration of temporal and spectral cues to vowel identity, (2) testing the robustness of the theory to variation in factors such as phonetic environment, signal periodicity, and spectral shape features such as formant amplitude relations, and (3) incorporating a psychologically plausible normalization scheme that might account for the ability of listeners to recognize speech from talkers with diverse vocal-tract characteristics.

A second set of experiments will measure how much phonetic information is transmitted to listeners by speech signals generated by specially designed speech synthesis algorithms that preserve only some characteristics of the original speech signal while purposely removing or distorting other cues. The results will allow inferences to be drawn about the nature of the underlying spectral representations that mediate phonetic recognition.

A third line of work is designed to evaluate a new theory of auditory frequency analysis that was developed during the previous grant period. The theory assumes that the auditory spectrum is calculated not at the periphery but in the central nervous system through an analysis of auditory nerve firing patterns. A software simulation will be developed for use in testing the validity of the theory.
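
The template-matching idea in the vowel perception model can be illustrated with a minimal sketch. The abstract does not specify how templates are smoothed or how input spectra are compared with them, so the moving-average smoothing, the Euclidean distance, and all function names below are assumptions chosen for illustration only, not the applicant's implementation. Spectra are represented as plain lists of magnitudes over a small number of frequency bins.

```python
import math

def smooth(spectrum, width=3):
    """Moving-average smoothing (an assumed stand-in for the model's smoothing)."""
    half = width // 2
    out = []
    for i in range(len(spectrum)):
        window = spectrum[max(0, i - half):min(len(spectrum), i + half + 1)]
        out.append(sum(window) / len(window))
    return out

def build_templates(training):
    """Average the example spectra for each vowel, then smooth the mean.

    training: dict mapping a vowel label to a list of equal-length spectra.
    """
    templates = {}
    for vowel, spectra in training.items():
        mean = [sum(column) / len(spectra) for column in zip(*spectra)]
        templates[vowel] = smooth(mean)
    return templates

def distance(a, b):
    """Euclidean distance between two spectra (an assumed similarity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(spectrum, templates):
    """Return the vowel whose template lies closest to the unsmoothed input."""
    return min(templates, key=lambda vowel: distance(spectrum, templates[vowel]))

# Toy data: two invented vowel categories with energy in different bins.
toy_training = {
    "i": [[9, 8, 1, 1, 1, 1, 7, 8], [8, 9, 1, 1, 1, 2, 8, 7]],
    "a": [[1, 1, 8, 9, 8, 1, 1, 1], [1, 2, 9, 8, 9, 1, 1, 1]],
}
toy_templates = build_templates(toy_training)
print(classify([9, 9, 1, 1, 1, 1, 8, 8], toy_templates))
```

The input spectrum is deliberately left unsmoothed while the templates are smoothed, mirroring the abstract's description of narrowband input compared against smoothed spectral-shape templates. The toy values are invented and carry no empirical meaning.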
