Subjective analysis of an HMM-based visual speech synthesizer

J. J. Williams*, A. K. Katsaggelos, D. C. Garstecki

*Corresponding author for this work

Research output: Contribution to journal, Conference article, peer-review

4 Scopus citations


Emerging broadband communication systems promise a future of multimedia telephony. The addition of visual information during telephone conversations, for example, would be most beneficial to people with impaired hearing who are able to speechread. For the present, it is useful to consider the problem of generating the critical information needed for speechreading from the existing narrowband communication systems used for speech. This paper focuses on the problem of synthesizing visual articulatory movements given the acoustic speech signal. A Hidden Markov Model (HMM)-based visual speech synthesizer is designed to improve speech understanding. The key elements in the application of HMMs to this problem are: a) the decomposition of the overall modeling task into key stages; and b) the judicious determination of the components of the observation vector for each stage. The main contribution of this paper is the development of a novel correlation HMM that is able to integrate independently trained acoustic and visual HMMs for speech-to-visual synthesis. This model allows increased flexibility in choosing model topologies for the acoustic and visual HMMs. It also reduces the amount of required training data compared to early integration modeling techniques. Results from objective and subjective analysis show that an HMM correlating model can significantly decrease audio-visual synchronization errors and increase speech understanding.

Original language: English (US)
Pages (from-to): 544-555
Number of pages: 12
Journal: Proceedings of SPIE - The International Society for Optical Engineering
State: Published - 2001
Event: Human Vision and Electronic Imaging VI - San Jose, CA, United States
Duration: Jan 22 2001 to Jan 25 2001

ASJC Scopus subject areas

  • Electronic, Optical and Magnetic Materials
  • Condensed Matter Physics
  • Computer Science Applications
  • Applied Mathematics
  • Electrical and Electronic Engineering

