The beauty in a beast: Minimising the effects of diverse recording quality on vowel formant measurements in sociophonetic real-time studies

Citation data:

Speech Communication, ISSN: 0167-6393, Vol: 86, Page: 24-41

Tamara Rathcke; Jane Stuart-Smith; Bernard Torsney; Jonathan Harrington
Elsevier BV
Computer Science; Mathematics; Social Sciences; Arts and Humanities
article description
Sociophonetic real-time studies of vowel variation and change rely on acoustic analyses of sound recordings made at different times, often using different equipment and data collection procedures. The circumstances of a recording are known to affect formant tracking and may therefore compromise the validity of conclusions about sound changes made on the basis of real-time data. In this paper, a traditional F1/F2 analysis using linear predictive coding (LPC) was applied to the vowels /i u a/ extracted from spontaneous speech corpora of Glaswegian vernacular that were recorded in the 1970s and 2000s. We assessed the technical quality of each recording, concentrating on the average levels of noise and the properties of spectral balance, and showed that the corpus comprised mixed-quality data. A series of acoustic vowel analyses subsequently revealed that formant measurements using LPC were sensitive to the technical specification of a recording, with the magnitude of the effects varying across vowel qualities. We evaluated the performance of three commonly used formant normalisation procedures (Lobanov, Nearey and Watt-Fabricius) as well as normalisations by a distance ratio metric and statistical estimation, and compared these results to raw Bark-scaled formant data, showing that some of the approaches could ameliorate the impact of technical issues better than others. We discuss the implications of these results for sociophonetic research that aims to minimise extraneous influences on recorded speech data while unveiling gradual, potentially small-scale sound changes across decades.
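For readers unfamiliar with the transformations named in the abstract, the following is a minimal sketch of two of them: Bark scaling and Lobanov normalisation. It is not the authors' implementation. The Bark conversion assumes Traunmüller's (1990) formula, which the abstract does not specify; the Lobanov step is the usual per-speaker, per-formant z-score, here computed with the population standard deviation (also an assumption). The function names and the formant values are hypothetical and purely illustrative.

```python
from statistics import mean, pstdev


def hz_to_bark(f_hz):
    """Convert a frequency in Hz to Bark.

    Assumption: Traunmueller's (1990) approximation; the abstract does
    not say which Bark formula was used.
    """
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def lobanov(values):
    """Z-score one formant (e.g. all F1 values) of a single speaker.

    Assumption: population standard deviation; some implementations
    use the sample standard deviation instead.
    """
    m = mean(values)
    sd = pstdev(values)
    return [(v - m) / sd for v in values]


# Hypothetical F1 measurements (Hz) for one speaker's /i u a/ tokens.
f1_hz = [300.0, 350.0, 750.0]

f1_bark = [hz_to_bark(f) for f in f1_hz]
f1_lobanov = lobanov(f1_hz)
```

After Lobanov normalisation each speaker's values for a given formant have mean 0 and unit standard deviation, which is what makes measurements comparable across speakers; whether it also compensates for recording quality is the empirical question the paper addresses.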