Doctoral Dissertation Research: Intonational Cues in Emotional Contexts

$10,091 · FY2022 · SBE · NSF

Northwestern University, Evanston IL

Abstract

The meaning conveyed through language depends not only on what is said (sounds, words, phrases) but also on how it is said (pitch, loudness, tempo, and other dimensions of intonation). In American English, distinct intonation patterns convey meaning related to the discourse context of an utterance: they distinguish questions from assertions, signal the continuation of a speaker's turn in a dialogue, or mark emphasis on words that contribute new information. Decoding linguistic meaning from phonetic cues related to intonation is complicated by the fact that the same phonetic parameters are also integral to the speaker's expression of emotion. How do these two factors, discourse meaning and speaker emotion, interact in determining the intonational form of an utterance, and how do listeners disentangle intonational cues to linguistic meaning from cues to the speaker's emotional state? To answer this question, this project investigates how the phonetic encoding of discourse meaning varies with speaker emotion, and how listeners use information about a speaker's emotional state to guide their interpretation of discourse meaning. Phase I of this project examines evidence from speech production experiments that elicit specific intonation patterns in sentences produced under different conditions of enacted emotion. Phase II tests how listeners' perception of speaker emotion affects how they process linguistic intonational categories and interpret discourse meaning. The findings from this study will shed light on the interaction between linguistic context and speaker emotion in the production and perception of intonation.

This work adopts the Autosegmental-Metrical model of American English intonation, in which the phonetic implementation of intonation derives from phonological structures that group words into prosodic phrases, and from tone features that mark prosodic phrase edges (boundary tones) and phrasal prominence (pitch accents). The investigation focuses on how enacted emotion affects the production of eight phrase-final intonation "tunes" (e.g., falling pitch or rising pitch) that have distinct pragmatic meanings. Tunes are elicited using an imitation paradigm: participants hear a tune on a model sentence and then produce the same tune on a new sentence, presented in a discourse context congruent with the tune meaning, under four conditions of enacted emotion. The effect of emotion on acoustic correlates of intonation is modeled using Bayesian mixed-effects regression. A subset of the recorded utterances from the production experiment is used as stimuli in a series of perception and comprehension experiments. Perceptual discrimination among tunes is tested as a function of enacted emotion using an AXB discrimination task and a free classification task in which listeners group utterances according to the speaker's inferred communicative goal (e.g., to express surprise, to make an assertion). Comprehension of tune meaning is tested through a structured set of questions probing listeners' pragmatic judgments for each tune as a function of the enacted emotion. This project contributes to linguistic theories of intonational form and meaning, with new insights into speaker emotion as a source of variation in intonation production and perception, and contributes a new corpus of speech recordings, the English IntEX Corpus, for future research on intonation and emotion.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
