Dynamic Neural Mechanisms of Audiovisual Speech Perception
Columbia University Health Sciences, New York NY
Abstract
Natural speech perception is multisensory; when conversing with someone whom we can see, our brains combine visual (V) information from facial, postural and hand gestures with auditory (A) information from the voice. The underlying speech processing is extremely rapid: incoming AV units (e.g., syllables) arrive every few hundred milliseconds, and each must be encoded and passed on before the next one arrives. Finally, this bottom-up sensory information is combined with a top-down cognitive component: what we perceive is strongly influenced by its context. Speech is fundamentally human, and thus its brain mechanisms are usually studied with noninvasive fMRI, EEG and MEG. Because each method has critical limitations in spatial or temporal resolution, identifying the specific brain mechanisms of speech perception (AV integration, precise and rapid information encoding, and top-down control) is a nearly intractable problem. This three-year U01 project will sidestep the problem using direct recording of neuron ensemble (electrocorticographic or ECoG) activity and single neuron activity, along with direct stimulation of selected sites in the brains of surgical epilepsy patients as they process AV speech. Our collaborative ECoG team embodies expertise in multisensory integration and speech perception and leverages the skills and perspectives of neuroscientists, neurosurgeons, engineers, neuropsychologists, neurologists, and ethicists across three leading epilepsy centers: Columbia University Medical Center, Baylor College of Medicine and Northshore-Long Island Jewish Medical Center. By combining the expertise and patients available at all three centers, we will be able to tackle problems that are inaccessible to individual investigators.
Our overarching hypothesis, building on our past work and supported by preliminary data, is that fluctuations in the excitability of neurons (oscillations) play a key role in speech processing. Aim 1 tests the general hypothesis that delta/theta range (2-8 Hz) neuronal oscillations play a key role in the integration of auditory and visual speech information. Aim 2 tests the general hypothesis that high-frequency activity (50 Hz and above) encodes representations of auditory and visual speech information, reflecting both bottom-up and top-down influences on perception. The concept employed in this proposal, that oscillatory dynamics are mechanistic instruments (Aim 1) that organize the encoding of information in neuronal firing patterns under dynamic top-down control (Aim 2), is part of a paradigm shift in speech science. The broad goal of this proposal is to contribute key foundations for this new paradigm and to set the stage for a comprehensive understanding of the brain circuits and physiological processes underlying natural speech perception, including in complex social settings.