Machine classification of acoustic waveforms as speech events is often difficult due to context dependencies. Here a vowel recognition task with multiple speakers is studied using a class of modular and hierarchical systems referred to as mixtures-of-experts and hierarchical mixtures-of-experts models. The statistical model underlying these systems is a mixture model in which both the mixture coefficients and the mixture components are generalized linear models. A full Bayesian approach is used as the basis of inference and prediction, with computations performed using Markov chain Monte Carlo methods. A key benefit of this approach is the ability to obtain a sample from the posterior distribution of any functional of the parameters of the given model, which provides more information than a point estimate. It also avoids the need to rely on a normal approximation to the posterior as the basis of inference, which is particularly important when the posterior is skewed or multimodal. Comparisons between a hierarchical mixtures-of-experts model and other pattern classification systems on the vowel recognition task are reported. The results indicate that this model classified well and, in addition, made it possible to assess the degree of certainty of the model in its classification predictions.
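The model structure and the use of posterior samples described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single-level mixture-of-experts (not the hierarchical variant) with softmax gating and multinomial-logit experts, and it uses random arrays as stand-ins for Markov chain Monte Carlo draws. All names (`moe_class_probs`, `gate_draws`, `expert_draws`) and all dimensions are hypothetical.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def moe_class_probs(x, gate_w, expert_w):
    """Class probabilities of a single-level mixture-of-experts.

    x:        (d,) input features
    gate_w:   (m, d) gating weights, one row per expert
    expert_w: (m, k, d) expert weights, one (k, d) logit model per expert
    """
    g = softmax(gate_w @ x)      # (m,) mixture coefficients
    p = softmax(expert_w @ x)    # (m, k) class probabilities per expert
    return g @ p                 # (k,) mixture of the expert predictions

# Placeholder "posterior draws": random numbers, NOT real MCMC output.
rng = np.random.default_rng(0)
d, m, k, n_draws = 3, 2, 4, 500
x = rng.normal(size=d)
gate_draws = rng.normal(size=(n_draws, m, d))
expert_draws = rng.normal(size=(n_draws, m, k, d))

# Evaluating the functional of interest (here, the class probabilities
# for x) on every draw gives a sample from its posterior distribution.
probs = np.stack([
    moe_class_probs(x, gw, ew) for gw, ew in zip(gate_draws, expert_draws)
])

posterior_mean = probs.mean(axis=0)                       # point summary
lower, upper = np.percentile(probs, [2.5, 97.5], axis=0)  # 95% intervals
```

With real posterior draws in place of the placeholders, the width of each interval `upper - lower` gives one way to express how certain the model is about a given classification, without assuming the posterior is normal.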
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty