Modeling and recognition of phonetic and prosodic factors for improvements to acoustic speech recognition models

Sarah Borys, Mark Hasegawa-Johnson, Jennifer Cole, Aaron Cohen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper examines the usefulness of including prosodic and phonetic context information in the phoneme model of a speech recognizer. This is done by creating a series of prosodic and phonetic models and then comparing the mutual information between the observations and each possible context variable. Prosodic variables show improvement less often than phone context variables; however, when they do, they generally show a larger increase in mutual information. A recognizer with allophones defined using the maximum mutual information prosodic and phonetic variables outperforms a recognizer with allophones defined exclusively using phonetic variables.
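As a rough illustration of the comparison the abstract describes, the sketch below estimates the mutual information between a sequence of observations and each candidate context variable, then ranks the candidates. It is not the authors' implementation: it assumes the observations have already been quantized into discrete labels, uses a simple plug-in (count-based) estimate, and the names `mutual_information`, `rank_context_variables`, `stress`, and `next_phone` are hypothetical, as is the toy data.

```python
import math
from collections import Counter


def mutual_information(observations, contexts):
    """Plug-in estimate of I(O; C) in bits from paired discrete samples.

    Assumption: observations are discrete labels (e.g. quantized acoustic
    frames); the paper's actual observation model is not specified here.
    """
    if len(observations) != len(contexts):
        raise ValueError("observations and contexts must have the same length")
    n = len(observations)
    if n == 0:
        raise ValueError("need at least one sample")

    joint = Counter(zip(observations, contexts))
    obs_counts = Counter(observations)
    ctx_counts = Counter(contexts)

    mi = 0.0
    for (o, c), n_oc in joint.items():
        # p(o, c) * log2( p(o, c) / (p(o) * p(c)) ), written with raw counts
        mi += (n_oc / n) * math.log2(n_oc * n / (obs_counts[o] * ctx_counts[c]))
    return mi


def rank_context_variables(observations, candidates):
    """Return (name, MI) pairs sorted by estimated MI, highest first."""
    scores = {
        name: mutual_information(observations, values)
        for name, values in candidates.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (hypothetical): one quantized observation label per phone token,
# plus two candidate context variables for the same tokens.
observations = ["a", "a", "b", "b", "a", "b", "a", "b"]
candidates = {
    "stress": ["s", "s", "u", "u", "s", "u", "s", "u"],      # prosodic context
    "next_phone": ["t", "d", "t", "d", "t", "d", "t", "d"],  # phonetic context
}

for name, score in rank_context_variables(observations, candidates):
    print(f"{name}: {score:.3f} bits")
```

In this toy example the variable that carries more information about the observations ranks first; in a setting like the one the abstract describes, the highest-ranked variables would be the candidates for defining allophones.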

Original language: English (US)
Title of host publication: 8th International Conference on Spoken Language Processing, ICSLP 2004
Publisher: International Speech Communication Association
Pages: 3013-3016
Number of pages: 4
State: Published - Jan 1 2004
Event: 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of
Duration: Oct 4 2004 to Oct 8 2004

Other

Other: 8th International Conference on Spoken Language Processing, ICSLP 2004
Country/Territory: Korea, Republic of
City: Jeju, Jeju Island
Period: 10/4/04 to 10/8/04

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
