Performance of Johnson–Lindenstrauss transform for k-means and k-medians clustering

Konstantin Makarychev, Yury Makarychev*, Ilya Razenshteyn

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contribution

3 Scopus citations

Abstract

Consider an instance of Euclidean k-means or k-medians clustering. We show that the cost of the optimal solution is preserved up to a factor of (1 + ε) under a projection onto a random O(log(k/ε)/ε2)-dimensional subspace. Further, the cost of every clustering is preserved within (1 + ε). More generally, our result applies to any dimension reduction map satisfying a mild sub-Gaussian-tail condition. Our bound on the dimension is nearly optimal. Additionally, our result applies to Euclidean k-clustering with the distances raised to the p-th power for any constant p. For k-means, our result resolves an open problem posed by Cohen, Elder, Musco, Musco, and Persu (STOC 2015); for k-medians, it answers a question raised by Kannan.

Original languageEnglish (US)
Title of host publicationSTOC 2019 - Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing
EditorsMoses Charikar, Edith Cohen
PublisherAssociation for Computing Machinery
Pages1027-1038
Number of pages12
ISBN (Electronic)9781450367059
DOIs
StatePublished - Jun 23 2019
Event51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019 - Phoenix, United States
Duration: Jun 23 2019Jun 26 2019

Publication series

NameProceedings of the Annual ACM Symposium on Theory of Computing
ISSN (Print)0737-8017

Conference

Conference51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019
CountryUnited States
CityPhoenix
Period6/23/196/26/19

Keywords

  • Clustering
  • Dimension reduction
  • Johnson-Lindenstrauss transform
  • K-means
  • K-medians
  • Kirszbraun theorem

ASJC Scopus subject areas

  • Software

Fingerprint Dive into the research topics of 'Performance of Johnson–Lindenstrauss transform for k-means and k-medians clustering'. Together they form a unique fingerprint.

  • Cite this

    Makarychev, K., Makarychev, Y., & Razenshteyn, I. (2019). Performance of Johnson–Lindenstrauss transform for k-means and k-medians clustering. In M. Charikar, & E. Cohen (Eds.), STOC 2019 - Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing (pp. 1027-1038). (Proceedings of the Annual ACM Symposium on Theory of Computing). Association for Computing Machinery. https://doi.org/10.1145/3313276.3316350