TY - GEN
T1 - Clustering semi-random mixtures of Gaussians
AU - Awasthi, Pranjal
AU - Vijayaraghavan, Aravindan
N1 - Publisher Copyright:
© Copyright 2018 by the Authors. All rights reserved.
PY - 2018
Y1 - 2018
N2 - Gaussian mixture models (GMM) are the most widely used statistical model for the k-means clustering problem and form a popular framework for clustering in machine learning and data analysis. In this paper, we propose a natural robust model for k-means clustering that generalizes the Gaussian mixture model, and that we believe will be useful in identifying robust algorithms. Our first contribution is a polynomial time algorithm that provably recovers the ground-truth up to small classification error w.h.p., assuming certain separation between the components. Perhaps surprisingly, the algorithm we analyze is the popular Lloyd's algorithm for k-means clustering that is the method-of-choice in practice. Our second result complements the upper bound by giving a nearly matching lower bound on the number of misclassified points incurred by any k-means clustering algorithm on the semi-random model.
AB - Gaussian mixture models (GMM) are the most widely used statistical model for the k-means clustering problem and form a popular framework for clustering in machine learning and data analysis. In this paper, we propose a natural robust model for k-means clustering that generalizes the Gaussian mixture model, and that we believe will be useful in identifying robust algorithms. Our first contribution is a polynomial time algorithm that provably recovers the ground-truth up to small classification error w.h.p., assuming certain separation between the components. Perhaps surprisingly, the algorithm we analyze is the popular Lloyd's algorithm for k-means clustering that is the method-of-choice in practice. Our second result complements the upper bound by giving a nearly matching lower bound on the number of misclassified points incurred by any k-means clustering algorithm on the semi-random model.
UR - http://www.scopus.com/inward/record.url?scp=85057246073&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85057246073&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85057246073
T3 - 35th International Conference on Machine Learning, ICML 2018
SP - 469
EP - 494
BT - 35th International Conference on Machine Learning, ICML 2018
A2 - Krause, Andreas
A2 - Dy, Jennifer
PB - International Machine Learning Society (IMLS)
T2 - 35th International Conference on Machine Learning, ICML 2018
Y2 - 10 July 2018 through 15 July 2018
ER -