Multisensory integration of faces and voices in the primate temporal lobe

$340,912 | R01 | FY2013 | NS | NIH

Princeton University, Princeton NJ

Abstract

DESCRIPTION (provided by applicant): The major aim of our research is to understand how dynamic visual and auditory components of vocal expressions (e.g., speech) are combined behaviorally and physiologically to enhance communication. For example, holding a conversation among a group of individuals at a party means that all around you are the sounds of voices, laughter and music. In this mixture of sounds, the problem your brain is confronted with is to deftly detect when a person is saying something and discriminate what she is saying. To make this task easier, our brains do not rely entirely on the person's voice, but also take advantage of the movement of the person's face while she speaks. The motions of the mouth provide spatial and temporal cues. These multidimensional cues enhance detection and discrimination of voices. The focus of our work will be on what role the auditory cortex plays in integrating faces and voices, and how its role may differ from that of the more traditional association areas such as the superior temporal sulcus. We have four main hypotheses. First, we hypothesize that the magnitude of the behavioral advantage, in terms of multisensory benefits on reaction times, will relate to the response magnitude and response latency of auditory cortical neurons. To address this, we will record from the lateral belt auditory cortex during the performance of an audiovisual vocal detection task in noise. Second, we predict that the auditory cortex will show a rhythm preference for normal speech relative to slowed or sped-up speech and that this preference will also result in greater spiking output, greater spike-speech phase locking or both. Third, we hypothesize that the role of this rhythm is to chunk the auditory signal into manageable units to allow for further, more efficient processing of the fine structure of vocalizations.
We will then test the possibility that a rhythmic visual signal could compensate for disruptions in the rhythmicity of the auditory component of vocalizations; we will test this both behaviorally and at the level of auditory cortical signals. Fourth, we hypothesize that processes occurring in the superior temporal sulcus during the same detection and discrimination tasks will be different from those occurring in the auditory cortex. This difference will arise primarily because the superior temporal sulcus receives supra-threshold inputs from both the auditory and visual modalities, whereas the auditory cortex only receives a modulatory, sub-threshold influence from the visual modality. Our work has the potential to illuminate the neurophysiological mechanisms that go awry in a number of communication disorders. First, relative to typically-developing children, children with autism spectrum disorders exhibit impaired neural processing and impaired detection of audiovisual speech in noisy backgrounds. Second, a recent theory of dyslexia suggests that dyslexics are impaired at linking phonological sounds with vision. Third, relative to normal individuals, schizophrenic patients are particularly impaired at discriminating audiovisual versus auditory-only speech in noisy backgrounds. One likely substrate for these impairments is the temporal lobe, where faces and voices are first combined neurophysiologically.
