Speech Disorders Classification by CNN in Phonetic E-Learning System

Jueting Liu, Chang Ren, Yaoxuan Luan, Sicheng Li, Tianshi Xie, Cheryl Seals*, Marisha Speights Atkins

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Speech disorders may affect the process of phonetic transcription. In the Automated Phonetic Transcription-the grading tool (APTgt), a linguistic e-learning system, we propose a speech disorders classification module that distinguishes disordered from non-disordered speech, in order to reduce the influence of disordered speech on phonetic exams. Mel-frequency cepstral coefficients (MFCCs) are used to represent the features of the speech sound files. Using two different formats of MFCCs, we adopt two classification approaches: (1) calculating the similarity between MFCC values with the dynamic time warping (DTW) algorithm and classifying the resulting distances with a support vector machine (SVM); and (2) classifying the MFCCs directly as images with a convolutional neural network (CNN). This paper focuses on the second approach.
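As an illustration of the DTW step mentioned in the first approach, the following is a minimal sketch of a textbook DTW distance between two sequences of feature frames. It is not the authors' implementation: the function name `dtw_distance`, the Euclidean local cost, and the toy two-dimensional frames (standing in for real MFCC vectors) are all assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Classic DTW distance between two sequences of feature vectors.

    Each sequence is a list of frames; each frame is a list of floats
    (hypothetical stand-ins for per-frame MFCC vectors).
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: Euclidean distance between the two frames (an assumption).
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy data: `slow` is a time-stretched copy of `ref`; `other` is unrelated.
ref = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
slow = [[0.0, 1.0], [0.0, 1.0], [1.0, 2.0], [2.0, 3.0], [2.0, 3.0]]
other = [[5.0, 5.0], [6.0, 6.0], [7.0, 7.0]]

d_same = dtw_distance(ref, slow)
d_diff = dtw_distance(ref, other)
```

Because DTW allows frames to be repeated during alignment, the time-stretched copy aligns to the reference at zero cost, while the unrelated sequence yields a larger distance. Distances of this kind could then serve as input features to a classifier such as an SVM, as the abstract describes.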

Original language: English (US)
Title of host publication: Artificial Intelligence in HCI - 3rd International Conference, AI-HCI 2022, Held as Part of the 24th HCI International Conference, HCII 2022, Proceedings
Editors: Helmut Degen, Stavroula Ntoa
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 557-566
Number of pages: 10
ISBN (Print): 9783031056420
State: Published - 2022
Event: 3rd International Conference on Artificial Intelligence in HCI, AI-HCI 2022, Held as Part of the 24th HCI International Conference, HCII 2022 - Virtual, Online
Duration: Jun 26 2022 to Jul 1 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13336 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 3rd International Conference on Artificial Intelligence in HCI, AI-HCI 2022, Held as Part of the 24th HCI International Conference, HCII 2022
City: Virtual, Online
Period: 6/26/22 to 7/1/22

Keywords

  • Convolutional neural network
  • Dynamic time warping
  • E-learning
  • Mel-frequency cepstral coefficients
  • Phonetic transcription
  • Speech disorders

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
