Unsupervised topic modelling for multi-party spoken discourse

Matthew Purver*, Konrad P. Körding, Thomas L. Griffiths, Joshua B. Tenenbaum

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

85 Scopus citations

Abstract

We present a method for unsupervised topic modelling which adapts methods used in document classification (Blei et al., 2003; Griffiths and Steyvers, 2004) to unsegmented multi-party discourse transcripts. We show how Bayesian inference in this generative model can be used to simultaneously address the problems of topic segmentation and topic identification: automatically segmenting multi-party meetings into topically coherent segments with performance which compares well with previous unsupervised segmentation-only methods (Galley et al., 2003) while simultaneously extracting topics which rate highly when assessed for coherence by human judges. We also show that this method appears robust in the face of off-topic dialogue and speech recognition errors.

Original language: English (US)
Title of host publication: COLING/ACL 2006 - 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
Pages: 17-24
Number of pages: 8
Volume: 1
State: Published - Dec 1 2006
Event: 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, COLING/ACL 2006 - Sydney, NSW, Australia
Duration: Jul 17 2006 to Jul 21 2006

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Purver, M., Körding, K. P., Griffiths, T. L., & Tenenbaum, J. B. (2006). Unsupervised topic modelling for multi-party spoken discourse. In COLING/ACL 2006 - 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Vol. 1, pp. 17-24).