Two knives cut better than one: Chinese word segmentation with dual decomposition

Mengqiu Wang, Rob Voigt, Christopher D. Manning

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

13 Scopus citations

Abstract

There are two dominant approaches to Chinese word segmentation: word-based and character-based models, each with its own strengths. Prior work has shown that gains in segmentation performance can be achieved from combining these two types of models; however, past efforts have not provided a practical technique to allow mainstream adoption. We propose a method that effectively combines the strengths of both segmentation schemes using an efficient dual-decomposition algorithm for joint inference. Our method is simple and easy to implement. Experiments on SIGHAN 2003 and 2005 evaluation datasets show that our method achieves the best reported results to date on 6 out of 7 datasets.

Original language: English (US)
Title of host publication: Long Papers
Publisher: Association for Computational Linguistics (ACL)
Pages: 193-198
Number of pages: 6
ISBN (Print): 9781937284732
State: Published - Jan 1 2014
Event: 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014 - Baltimore, MD, United States
Duration: Jun 22 2014 to Jun 27 2014

Publication series

Name: 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014 - Proceedings of the Conference
Volume: 2

Other

Other: 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014
Country: United States
City: Baltimore, MD
Period: 6/22/14 to 6/27/14

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
