A probabilistic model of meetings that combines words and discourse features

Mike Dowman*, Virginia Savova, Thomas L. Griffiths, Konrad P. Körding, Joshua B. Tenenbaum, Matthew Purver

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

17 Scopus citations

Abstract

In order to determine the points at which meeting discourse changes from one topic to another, probabilistic models were used to approximate the process through which meeting transcripts were produced. Gibbs sampling was used to estimate the values of random variables in the models, including the locations of topic boundaries. This paper shows how discourse features were integrated into the Bayesian model and reports empirical evaluations of the benefit obtained through the inclusion of each feature and of the suitability of alternative models of the placement of topic boundaries. It demonstrates how multiple cues to segmentation can be combined in a principled way, and empirical tests show a clear improvement over previous work.
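As a rough illustration of how Gibbs sampling can estimate topic boundary locations, the sketch below resamples one binary boundary indicator at a time from its conditional distribution. It is a deliberately simplified stand-in, not the model described in the paper: it assumes each segment has its own word distribution with a symmetric Dirichlet prior (integrated out analytically), it places an independent Bernoulli prior on every boundary, and it ignores discourse features entirely. The function names, the hyperparameters `beta` and `pi`, and the toy data are all hypothetical.

```python
import math
import random
from collections import Counter


def segment_log_lik(utterances, vocab_size, beta):
    """Log marginal likelihood of a segment's words under a symmetric
    Dirichlet(beta) prior on the segment's word distribution."""
    counts = Counter(w for u in utterances for w in u)
    total = sum(counts.values())
    ll = math.lgamma(vocab_size * beta) - math.lgamma(total + vocab_size * beta)
    for c in counts.values():
        ll += math.lgamma(c + beta) - math.lgamma(beta)
    return ll


def log_joint(utterances, boundaries, vocab_size, beta, pi):
    """Log joint probability of the words and the boundary indicators.
    boundaries[i] == 1 means a new segment starts after utterance i."""
    lp = 0.0
    start = 0
    for i, b in enumerate(boundaries):
        lp += math.log(pi if b else 1.0 - pi)  # assumed Bernoulli(pi) prior
        if b:
            lp += segment_log_lik(utterances[start:i + 1], vocab_size, beta)
            start = i + 1
    lp += segment_log_lik(utterances[start:], vocab_size, beta)
    return lp


def gibbs_boundaries(utterances, vocab_size, beta=0.1, pi=0.2,
                     sweeps=200, burn_in=50, seed=0):
    """Return the estimated posterior probability of a boundary at each gap."""
    rng = random.Random(seed)
    n_gaps = len(utterances) - 1
    boundaries = [0] * n_gaps
    tallies = [0] * n_gaps
    for sweep in range(sweeps):
        for i in range(n_gaps):
            # Score both settings of indicator i with all others held fixed.
            boundaries[i] = 1
            lp1 = log_joint(utterances, boundaries, vocab_size, beta, pi)
            boundaries[i] = 0
            lp0 = log_joint(utterances, boundaries, vocab_size, beta, pi)
            diff = lp0 - lp1
            p1 = 0.0 if diff > 700 else 1.0 / (1.0 + math.exp(diff))
            boundaries[i] = 1 if rng.random() < p1 else 0
        if sweep >= burn_in:
            for i, b in enumerate(boundaries):
                tallies[i] += b
    kept = sweeps - burn_in
    return [t / kept for t in tallies]


# Toy transcript: five utterances drawn from words 0-2, then five from words 3-5.
utterances = [
    [0, 1, 2, 0], [1, 2, 0, 1], [2, 0, 1, 2], [0, 1, 2, 1], [1, 0, 2, 0],
    [3, 4, 5, 3], [4, 5, 3, 4], [5, 3, 4, 5], [3, 4, 5, 4], [4, 3, 5, 3],
]
probs = gibbs_boundaries(utterances, vocab_size=6)
for gap, p in enumerate(probs):
    print(f"gap after utterance {gap}: P(boundary) ~ {p:.2f}")
```

Recomputing the full joint for every flip keeps the sketch short; a practical sampler would cache per-segment counts and update only the two segments adjacent to the gap being resampled.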

Original language: English (US)
Pages (from-to): 1238-1248
Number of pages: 11
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 16
Issue number: 7
DOI: 10.1109/TASL.2008.925867
State: Published - Sep 2008

Funding

Manuscript received June 18, 2007; revised April 30, 2008. This work was supported by the CALO project (DARPA Grant NBCH-D-03-0010), and the work of M. Dowman was supported by a Japan Society for the Promotion of Science postdoctoral fellowship. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Mark Johnson. M. Dowman is with the Department of General System Studies, University of Tokyo, Tokyo 153-8902, Japan (e-mail: [email protected]). V. Savova and J. B. Tenenbaum are with the Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139 USA (e-mail: [email protected]; [email protected]). T. L. Griffiths is with the Department of Psychology, University of California at Berkeley, Berkeley, CA 94720 USA (e-mail: [email protected]). K. P. Körding is with Physical Medicine and Rehabilitation, Northwestern University, Chicago, IL 60611 USA (e-mail: [email protected]). M. Purver is with the Center for the Study of Language and Information, Stanford University, Stanford, CA 94305 USA (e-mail: [email protected]). Digital Object Identifier 10.1109/TASL.2008.925867

Keywords

  • Gibbs sampling
  • Hierarchical Bayesian models
  • Latent Dirichlet allocation
  • Markov chain Monte Carlo
  • Topical segmentation

ASJC Scopus subject areas

  • Acoustics and Ultrasonics
  • Electrical and Electronic Engineering
