TY - GEN
T1 - Detecting non-modal phonation in telephone speech
AU - Yoon, Tae Jin
AU - Cole, Jennifer
AU - Hasegawa-Johnson, Mark
PY - 2008
Y1 - 2008
N2 - Non-modal phonation conveys both linguistic and paralinguistic information, and is distinguished by acoustic source and filter features. Detecting non-modal phonation in speech requires reliable F0 analysis, a problem for telephone-band speech, where F0 analysis frequently fails. We demonstrate an approach to the detection of creaky phonation in telephone speech based on robust F0 and spectral analysis. Our F0 analysis relies on an autocorrelation algorithm applied to the intensity-boosted and inverse-filtered speech signal and succeeds in regions of non-modal phonation where the non-filtered F0 analysis typically fails. In addition to the extracted F0 values, spectral amplitude is measured at the first two harmonics (H1, H2) and the first three formants (A1, A2, A3). Visual and spectral inspection of the detected creaky phonation confirms the findings reported from laboratory settings. Statistical analysis using one-way ANOVA and classification using a Support Vector Machine (SVM) yield promising results that point to further improvement in the automatic detection of non-modal phonation in telephone speech.
AB - Non-modal phonation conveys both linguistic and paralinguistic information, and is distinguished by acoustic source and filter features. Detecting non-modal phonation in speech requires reliable F0 analysis, a problem for telephone-band speech, where F0 analysis frequently fails. We demonstrate an approach to the detection of creaky phonation in telephone speech based on robust F0 and spectral analysis. Our F0 analysis relies on an autocorrelation algorithm applied to the intensity-boosted and inverse-filtered speech signal and succeeds in regions of non-modal phonation where the non-filtered F0 analysis typically fails. In addition to the extracted F0 values, spectral amplitude is measured at the first two harmonics (H1, H2) and the first three formants (A1, A2, A3). Visual and spectral inspection of the detected creaky phonation confirms the findings reported from laboratory settings. Statistical analysis using one-way ANOVA and classification using a Support Vector Machine (SVM) yield promising results that point to further improvement in the automatic detection of non-modal phonation in telephone speech.
UR - http://www.scopus.com/inward/record.url?scp=84902683025&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84902683025&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84902683025
SN - 9780616220030
T3 - Proceedings of the 4th International Conference on Speech Prosody, SP 2008
SP - 33
EP - 36
BT - Proceedings of the 4th International Conference on Speech Prosody, SP 2008
PB - International Speech Communication Association
T2 - 4th International Conference on Speech Prosody 2008, SP 2008
Y2 - 6 May 2008 through 9 May 2008
ER -