Abstract
This paper examines the usefulness of including prosodic and phonetic context information in the phoneme model of a speech recognizer. This is done by creating a series of prosodic and phonetic models and then comparing the mutual information between the observations and each possible context variable. Prosodic variables show improvement less often than phone context variables; however, when they do, they generally show a larger increase in mutual information. A recognizer with allophones defined using the maximum mutual information prosodic and phonetic variables outperforms a recognizer with allophones defined exclusively using phonetic variables.
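The comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the observations have been reduced to discrete symbols (for example, by vector quantization), whereas the actual models may use continuous acoustic features. The function names and the candidate variable names (`prosodic_var`, `phonetic_var`) are hypothetical.

```python
import math
from collections import Counter


def mutual_information(observations, contexts):
    """Estimate I(O; C) in bits from paired discrete samples.

    Assumption: both sequences hold hashable discrete labels of equal length.
    """
    if len(observations) != len(contexts):
        raise ValueError("observations and contexts must have the same length")
    n = len(observations)
    if n == 0:
        raise ValueError("at least one sample is required")

    joint = Counter(zip(observations, contexts))
    obs_counts = Counter(observations)
    ctx_counts = Counter(contexts)

    mi = 0.0
    for (o, c), n_oc in joint.items():
        # p(o, c) * log2( p(o, c) / (p(o) * p(c)) ), written in terms of counts
        mi += (n_oc / n) * math.log2(n_oc * n / (obs_counts[o] * ctx_counts[c]))
    return mi


def rank_context_variables(observations, candidates):
    """Sort candidate context variables by estimated mutual information, highest first."""
    scores = {
        name: mutual_information(observations, values)
        for name, values in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy data: the first candidate determines the observation, the second is independent of it.
    obs = [0, 0, 1, 1]
    candidates = {
        "prosodic_var": [0, 0, 1, 1],  # hypothetical variable name
        "phonetic_var": [0, 1, 0, 1],  # hypothetical variable name
    }
    for name, score in rank_context_variables(obs, candidates):
        print(f"{name}: {score:.3f} bits")
```

Under these assumptions, the variable with the highest estimated mutual information would be the first candidate for defining allophones; a plug-in estimate like this one is biased upward on small samples, so comparisons across variables with different numbers of values should be treated with care.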
Original language | English (US) |
---|---|
Title of host publication | 8th International Conference on Spoken Language Processing, ICSLP 2004 |
Publisher | International Speech Communication Association |
Pages | 3013-3016 |
Number of pages | 4 |
State | Published - Jan 1 2004 |
Event | 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of; Duration: Oct 4 2004 → Oct 8 2004 |
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language