TY - GEN
T1 - Speech Disorders Classification in Phonetic Exams with MFCC and DTW
AU - Liu, Jueting
AU - Speights, Marisha
AU - Bailey, Dallin
AU - Li, Sicheng
AU - Zhou, Huanyi
AU - Luan, Yaoxuan
AU - Xie, Tianshi
AU - Seals, Cheryl
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
AB - Recognizing disordered speech is a challenge for Automatic Speech Recognition (ASR) systems. This research focuses on classifying disordered vs. non-disordered speech through signal processing coupled with machine learning techniques. We have found little evidence of ASR systems that classify disordered vs. non-disordered speech at the level of expert-based classification. This research supports the Automated Phonetic Transcription - Grading Tool (APTgt), an online E-Learning system that supports Communications Disorders (CMDS) faculty during linguistics courses and provides reinforcement activities for phonetic transcription, with the potential to improve students' learning efficacy and teachers' pedagogical experience. In addition, APTgt generates interactive practice sessions and exams, automatic grading, and exam analysis. This paper focuses on the classification module that distinguishes disordered from non-disordered speech in support of APTgt. We use Mel-frequency cepstral coefficients (MFCCs) and dynamic time warping (DTW) to preprocess the audio files and calculate their similarity, and the Support Vector Machine (SVM) algorithm for classification and regression.
KW - Dynamic Time Warping
KW - E-Learning
KW - International Phonetic Alphabet
KW - MFCC
KW - Phonetic Transcription
KW - Speech Classification
KW - Support Vector Machine
UR - http://www.scopus.com/inward/record.url?scp=85126863551&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85126863551&partnerID=8YFLogxK
U2 - 10.1109/CIC52973.2021.00015
DO - 10.1109/CIC52973.2021.00015
M3 - Conference contribution
AN - SCOPUS:85126863551
T3 - Proceedings - 2021 IEEE 7th International Conference on Collaboration and Internet Computing, CIC 2021
SP - 35
EP - 40
BT - Proceedings - 2021 IEEE 7th International Conference on Collaboration and Internet Computing, CIC 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 7th IEEE International Conference on Collaboration and Internet Computing, CIC 2021
Y2 - 13 December 2021 through 15 December 2021
ER -