TY - GEN
T1 - Speech Disorders Classification by CNN in Phonetic E-Learning System
AU - Liu, Jueting
AU - Ren, Chang
AU - Luan, Yaoxuan
AU - Li, Sicheng
AU - Xie, Tianshi
AU - Seals, Cheryl
AU - Speights Atkins, Marisha
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - Speech disorders may affect the process of phonetic transcription. In the Automated Phonetic Transcription-the grading tool (APTgt), a linguistic E-learning system, we propose a speech disorders classification module that distinguishes disordered speech from non-disordered speech, in order to reduce the influence of disordered speech on phonetic exams. Mel-frequency cepstral coefficients (MFCCs) are used to represent the features of the speech sound files. Using two different formats of MFCCs, we adopted two classification approaches: (1) calculating the similarity between MFCC values with the dynamic time warping (DTW) algorithm and classifying the resulting distances with a support vector machine (SVM); and (2) classifying the MFCCs directly as images with a convolutional neural network (CNN). This paper focuses on the second approach.
KW - Convolutional neural network
KW - Dynamic time warping
KW - E-learning
KW - Mel-frequency cepstral coefficients
KW - Phonetic transcription
KW - Speech disorders
UR - http://www.scopus.com/inward/record.url?scp=85131122952&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85131122952&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-05643-7_36
DO - 10.1007/978-3-031-05643-7_36
M3 - Conference contribution
AN - SCOPUS:85131122952
SN - 9783031056420
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 557
EP - 566
BT - Artificial Intelligence in HCI - 3rd International Conference, AI-HCI 2022, Held as Part of the 24th HCI International Conference, HCII 2022, Proceedings
A2 - Degen, Helmut
A2 - Ntoa, Stavroula
PB - Springer Science and Business Media Deutschland GmbH
T2 - 3rd International Conference on Artificial Intelligence in HCI, AI-HCI 2022 Held as Part of the 24th HCI International Conference, HCII 2022
Y2 - 26 June 2022 through 1 July 2022
ER -