A phone-viseme dynamic Bayesian network for audio-visual automatic speech recognition

Louis Terry*, Aggelos K Katsaggelos

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Scopus citations

Abstract

This work extends and improves a recently introduced (Dec. 2007) dynamic Bayesian network (DBN) based audio-visual automatic speech recognition (AVASR) system. That system models the audio and visual components of speech as being composed of the same sub-word units when, in fact, this is not psycholinguistically true. We extend the system to model the audio and visual streams as being composed of separate, yet related, sub-word units. We also introduce a novel stream weighting structure incorporated into the model itself. [...] in recognition accuracy in a large vocabulary continuous speech recognition (LVCSR) task. The "best" performing proposed system attains a word error rate (WER) of 66.71%, whereas the "best" baseline system performs at a WER of 64.30%. The proposed system also improves accuracy to 45.95% from 39.40%.
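
For readers unfamiliar with the WER metric quoted in the abstract, the following is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions, not taken from the paper, and the sketch makes no claim about the scoring tool or accuracy definition the authors used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count.

    Illustrative helper (hypothetical name); standard Levenshtein alignment
    over whitespace-separated words.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up sentences, for illustration only.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, a WER computed this way can exceed 100% when the hypothesis is much longer than the reference.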

Original language: English (US)
Title of host publication: 2008 19th International Conference on Pattern Recognition, ICPR 2008
State: Published - Dec 1 2008
Event: 2008 19th International Conference on Pattern Recognition, ICPR 2008 - Tampa, FL, United States
Duration: Dec 8 2008 to Dec 11 2008

Publication series

Name: Proceedings - International Conference on Pattern Recognition
ISSN (Print): 1051-4651

Other

Other: 2008 19th International Conference on Pattern Recognition, ICPR 2008
Country: United States
City: Tampa, FL
Period: 12/8/08 to 12/11/08

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition

  • Cite this

    Terry, L., & Katsaggelos, A. K. (2008). A phone-viseme dynamic Bayesian network for audio-visual automatic speech recognition. In 2008 19th International Conference on Pattern Recognition, ICPR 2008 [4761927] (Proceedings - International Conference on Pattern Recognition).