Automated Large-Scale Phonetic Analysis: DASS Pilot

$377,295 | FY2016 | SBE | NSF

University Of Georgia Research Foundation Inc, Athens GA

Investigators

William A Kretzschmar (contact), Margaret E Renwick

Abstract

Generalizations about language contained in dictionaries and grammars hide the extensive variation in the way that speakers actually use language. However, modern technology now makes it possible to extract variation in pronunciation from spoken interviews by automated means. This research project uses available software to process sixty-four interviews with speakers from Florida, Georgia, Tennessee, Alabama, Mississippi, Louisiana, Arkansas, and Texas, recorded from 1968 to 1983. These interviews constitute a geographic and social sample of speakers across the Gulf States. All of the transcriptions of the interviews, the vowel pronunciation data, and the visualizations will be presented on the website of the Linguistic Atlas Project. Detailed data on actual speaker variation is relevant to the industrial methods currently used for speech recognition and speech synthesis. This project will be the first large-scale test of the complex systems model against acoustic phonetic data. The legacy interviews consist of over 200 GB of files containing 372 hours of digital audio. In the first stage of the research, vowel pronunciations will be extracted from a list of seventy-eight different words that were elicitation targets in the interviews, plus additional words found to occur frequently in the interviews, such as color terms, up to a total of three hundred words. The resulting data set will have approximately 22,500 vowel tokens per interview, nearly 1,500,000 tokens across the data set, a very large corpus of data on Southern American English. The second stage of the project will create visualizations of these tokens to determine the dimensions of variation in the realization of vowels by speaker, social category, and geographic area. The science of complex systems will be employed as a model in the analysis; it predicts that the wide range of realizations occurring in the groups under analysis will be self-organized into nonlinear distributional patterns.
The extraction and display of the full range of vowel variation has the potential to improve industrial methods used for both speech recognition and speech synthesis, as it offers a detailed view of actual variation for speakers and groups rather than assuming a consistent or "average" realization of vowels.
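
The corpus-size figures quoted in the abstract can be cross-checked with a short calculation. The sketch below uses only the numbers stated above (64 interviews, roughly 22,500 vowel tokens per interview, 372 hours of audio). The function name `corpus_estimates` and the derived per-hour and per-interview averages are illustrative assumptions, not figures reported by the project.

```python
# Figures taken from the abstract; the per-interview token count is approximate.
INTERVIEWS = 64
TOKENS_PER_INTERVIEW = 22_500
AUDIO_HOURS = 372


def corpus_estimates(interviews: int, tokens_per_interview: int, audio_hours: float) -> dict:
    """Derive rough corpus-level totals from the per-interview figures.

    Hypothetical helper for illustration only; the averages assume tokens
    and audio are spread evenly across interviews, which need not hold.
    """
    total_tokens = interviews * tokens_per_interview
    return {
        "total_tokens": total_tokens,
        "hours_per_interview": audio_hours / interviews,
        "tokens_per_hour": total_tokens / audio_hours,
    }


estimates = corpus_estimates(INTERVIEWS, TOKENS_PER_INTERVIEW, AUDIO_HOURS)
for name, value in estimates.items():
    print(f"{name}: {value:,.1f}")
```

The product of 64 interviews and 22,500 tokens is 1,440,000, which is consistent with the abstract's "nearly 1,500,000 tokens".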
