Clustering semi-random mixtures of Gaussians

Pranjal Awasthi*, Aravindan Vijayaraghavan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contribution


Gaussian mixture models (GMM) are the most widely used statistical model for the fc-means clustering problem and form a popular framework for clustering in machinc learning and data analysis. In this paper, we propose a natural robust model for fc-means clustering that generalizes the Gaussian mixture model, and that we believe will be useful in identifying robust algorithms. Our first contribution is a polynomial time algorithm that provably recovers the ground-truth up to small classification error w.h.p., assuming certain separation between the components. Perhaps surprisingly, the algorithm we analyze is the popular Lloyd's algorithm for fc-means clustering that is the method-of-choice in practice. Our second result complements the upper bound by giving a nearly matching lower bound on the number of misclassified points incurred by any A:-means clustering algorithm on the semi-random model.

Original languageEnglish (US)
Title of host publication35th International Conference on Machine Learning, ICML 2018
EditorsAndreas Krause, Jennifer Dy
PublisherInternational Machine Learning Society (IMLS)
Number of pages26
ISBN (Electronic)9781510867963
StatePublished - 2018
Event35th International Conference on Machine Learning, ICML 2018 - Stockholm, Sweden
Duration: Jul 10 2018Jul 15 2018

Publication series

Name35th International Conference on Machine Learning, ICML 2018


Other35th International Conference on Machine Learning, ICML 2018

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Human-Computer Interaction
  • Software


Dive into the research topics of 'Clustering semi-random mixtures of Gaussians'. Together they form a unique fingerprint.

Cite this