Video and audio data integration for conferencing

Thrasyvoulos N. Pappas*, Raynard O. Hinds

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contribution

8 Scopus citations

Abstract

In videoconferencing applications the perceived quality of the video signal is affected by the presence of an audio signal (speech). To achieve high compression rates, video coders must compromise image quality in terms of spatial resolution, grayscale resolution, and frame rate, and may introduce various kinds of artifact.s We consider tradeoffs in grayscale resolution and frame rate, and use subjective evaluations to assess the perceived quality of the video signal in the presence of speech. In particular we explore the importance of lip synchronization. In our experiment we used an original grayscale sequence at QCIF resolution, 30 frames/second, and 256 gray levels. We compared the 256-level sequence at different frame rates with a two-level version of the sequence at 30 frames/sec. The viewing distance was 20 image heights, or roughly two feet from an SGI workstation. We used uncoded speech. To obtain the two-level sequence we used an adaptive clustering algorithm for segmentation of video sequences. The binary sketches it creates move smoothly and preserve the main characteristics of the face, so that it is easily recognizable. More importantly, the rendering of lip and eye movements is very accurate. The test results indicate that when the frame rate of the full grayscale sequence is low (less than 5 frames/sec), most observers prefer the two-level sequence.

Original languageEnglish (US)
Title of host publicationProceedings of SPIE - The International Society for Optical Engineering
PublisherSociety of Photo-Optical Instrumentation Engineers
Pages120-127
Number of pages8
ISBN (Print)0819417580
StatePublished - Jan 1 1995
EventHuman Vision, Visual Processing, and Digital Display VI - San Jose, CA, USA
Duration: Feb 6 1995Feb 8 1995

Publication series

NameProceedings of SPIE - The International Society for Optical Engineering
Volume2411
ISSN (Print)0277-786X

Other

OtherHuman Vision, Visual Processing, and Digital Display VI
CitySan Jose, CA, USA
Period2/6/952/8/95

ASJC Scopus subject areas

  • Electronic, Optical and Magnetic Materials
  • Condensed Matter Physics
  • Computer Science Applications
  • Applied Mathematics
  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'Video and audio data integration for conferencing'. Together they form a unique fingerprint.

Cite this