Abstract
Does prosody help word recognition? In this paper, we propose a novel probabilistic framework in which words and phonemes depend on prosody in a way that improves word recognition. The prosodic attribute investigated in this study is the lengthening of speech segments in the vicinity of intonational phrase boundaries. An Explicit Duration Hidden Markov Model (EDHMM) is implemented to provide an accurate phoneme duration model. The study is conducted on the Boston University Radio News Corpus, with prosodic boundaries marked using the ToBI labelling system. We found that the lengthening of phrase-final rhymes can be reliably modelled by the EDHMM, which significantly improves prosody-dependent acoustic modelling. Conversely, no systematic duration variation is found in phrase-initial position. With prosody dependence implemented in the acoustic model, pronunciation model, and language model, both word recognition accuracy and boundary recognition accuracy are improved by 1% over systems without prosody dependence.
Original language | English (US) |
---|---|
Title of host publication | EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology |
Publisher | International Speech Communication Association |
Pages | 393-396 |
Number of pages | 4 |
State | Published - Jan 1 2003 |
Event | 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - Geneva, Switzerland |
Duration | Sep 1 2003 → Sep 4 2003 |
ASJC Scopus subject areas
- Computer Science Applications
- Software
- Linguistics and Language
- Communication