Biometrics has been a topic of great interest since the advent of the information age and promises a safer, simpler lifestyle in which passcodes and keys are inherent to the user. We describe a system capable of automatically extracting visual features from a human face for use in dynamic visual biometrics. Automatic speech and speaker recognition has recently moved towards incorporating visual information to improve upon audio-only recognition systems. With few exceptions, however, investigations into audio-visual (AV-ASR) and visual-only (V-ASR) automatic speech recognition have relied on ideal visual databases in their experiments. Our system incorporates robust and efficient computer vision algorithms to automatically detect, track and identify a speaker based on visual features extracted from the speaker's mouth region. The features are extracted in real time under adverse visual conditions. System performance is evaluated by comparing speaker recognition results obtained with automatic tracking data against those obtained with ground-truth tracking data: speaker recognition accuracy is 52.3% using ground-truth tracking data and 59.3% using automatic tracking data. The results are discussed, and future improvements and experiments are suggested.
Title of host publication: Proceedings of 45th Annual Allerton Conference on Communication, Control, and Computing
State: Published - 2007
Event: 45th Annual Allerton Conference on Communication, Control, and Computing - Monticello, IL
Duration: Sep 1 2007 → …