TY - GEN
T1 - A phone-viseme dynamic Bayesian network for audio-visual automatic speech recognition
AU - Terry, Louis
AU - Katsaggelos, Aggelos K.
PY - 2008
Y1 - 2008
N2 - This work extends and improves a recently introduced (Dec. 2007) dynamic Bayesian network (DBN) based audio-visual automatic speech recognition (AVASR) system. That system models the audio and visual components of speech as being composed of the same sub-word units when, in fact, this is not psycholinguistically true. We extend the system to model the audio and visual streams as being composed of separate, yet related, sub-word units. We also introduce a novel stream weighting structure incorporated into the model itself. The systems are compared on recognition accuracy in a large vocabulary continuous speech recognition (LVCSR) task. The "best" performing proposed system attains a WER of 66.71%, whereas the "best" baseline system performs at a WER of 64.30%. The proposed system also improves accuracy to 45.95% from 39.40%.
AB - This work extends and improves a recently introduced (Dec. 2007) dynamic Bayesian network (DBN) based audio-visual automatic speech recognition (AVASR) system. That system models the audio and visual components of speech as being composed of the same sub-word units when, in fact, this is not psycholinguistically true. We extend the system to model the audio and visual streams as being composed of separate, yet related, sub-word units. We also introduce a novel stream weighting structure incorporated into the model itself. The systems are compared on recognition accuracy in a large vocabulary continuous speech recognition (LVCSR) task. The "best" performing proposed system attains a WER of 66.71%, whereas the "best" baseline system performs at a WER of 64.30%. The proposed system also improves accuracy to 45.95% from 39.40%.
UR - http://www.scopus.com/inward/record.url?scp=77957944514&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77957944514&partnerID=8YFLogxK
U2 - 10.1109/icpr.2008.4761927
DO - 10.1109/icpr.2008.4761927
M3 - Conference contribution
AN - SCOPUS:77957944514
SN - 9781424421756
T3 - Proceedings - International Conference on Pattern Recognition
BT - 2008 19th International Conference on Pattern Recognition, ICPR 2008
PB - Institute of Electrical and Electronics Engineers Inc.
ER -