TY - JOUR
T1 - Automatic analysis of slips of the tongue
T2 - Insights into the cognitive architecture of speech production
AU - Goldrick, Matthew
AU - Keshet, Joseph
AU - Gustafson, Erin
AU - Heller, Jordana
AU - Needle, Jeremy
N1 - Funding Information:
Supported by National Science Foundation Grant BCS0846147 and National Institutes of Health Grant HD077140. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF or the NIH. Thanks to the Northwestern SoundLab and Jennifer Culbertson for helpful discussion and comments.
Publisher Copyright:
© 2016 Elsevier B.V.
PY - 2016/4/1
Y1 - 2016/4/1
N2 - Traces of the cognitive mechanisms underlying speaking can be found within subtle variations in how we pronounce sounds. While speech errors have traditionally been seen as categorical substitutions of one sound for another, acoustic/articulatory analyses show they partially reflect the intended sound. When "pig" is mispronounced as "big," the resulting /b/ sound differs from correct productions of "big," moving towards intended "pig", revealing the role of graded sound representations in speech production. Investigating the origins of such phenomena requires detailed estimation of speech sound distributions; this has been hampered by reliance on subjective, labor-intensive manual annotation. Computational methods can address these issues by providing objective, automatic measurements. We develop a novel high-precision computational approach, based on a set of machine learning algorithms, for measurement of elicited speech. The algorithms are trained on existing manually labeled data to detect and locate linguistically relevant acoustic properties with high accuracy. Our approach is robust, is designed to handle mis-productions, and overall matches the performance of expert coders. It allows us to analyze a very large dataset of speech errors (containing far more errors than the total in the existing literature), illuminating properties of speech sound distributions previously impossible to reliably observe. We argue that this provides novel evidence that two sources both contribute to deviations in speech errors: planning processes specifying the targets of articulation and articulatory processes specifying the motor movements that execute this plan. These findings illustrate how a much richer picture of speech provides an opportunity to gain novel insights into language processing.
AB - Traces of the cognitive mechanisms underlying speaking can be found within subtle variations in how we pronounce sounds. While speech errors have traditionally been seen as categorical substitutions of one sound for another, acoustic/articulatory analyses show they partially reflect the intended sound. When "pig" is mispronounced as "big," the resulting /b/ sound differs from correct productions of "big," moving towards intended "pig", revealing the role of graded sound representations in speech production. Investigating the origins of such phenomena requires detailed estimation of speech sound distributions; this has been hampered by reliance on subjective, labor-intensive manual annotation. Computational methods can address these issues by providing objective, automatic measurements. We develop a novel high-precision computational approach, based on a set of machine learning algorithms, for measurement of elicited speech. The algorithms are trained on existing manually labeled data to detect and locate linguistically relevant acoustic properties with high accuracy. Our approach is robust, is designed to handle mis-productions, and overall matches the performance of expert coders. It allows us to analyze a very large dataset of speech errors (containing far more errors than the total in the existing literature), illuminating properties of speech sound distributions previously impossible to reliably observe. We argue that this provides novel evidence that two sources both contribute to deviations in speech errors: planning processes specifying the targets of articulation and articulatory processes specifying the motor movements that execute this plan. These findings illustrate how a much richer picture of speech provides an opportunity to gain novel insights into language processing.
KW - Automatic phonetic analysis
KW - Machine learning
KW - Speech errors
KW - Speech production
KW - Structured prediction
UR - http://www.scopus.com/inward/record.url?scp=84953791763&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84953791763&partnerID=8YFLogxK
U2 - 10.1016/j.cognition.2016.01.002
DO - 10.1016/j.cognition.2016.01.002
M3 - Article
C2 - 26779665
AN - SCOPUS:84953791763
SN - 0010-0277
VL - 149
SP - 31
EP - 39
JO - Cognition
JF - Cognition
ER -