Efficient, certifiably optimal clustering with applications to latent variable graphical models

Carson Eisenach*, Han Liu

*Corresponding author for this work

Research output: Contribution to journalArticle

Abstract

Motivated by the task of clustering either d variables or d points into K groups, we investigate efficient algorithms to solve the Peng–Wei (P–W) K-means semi-definite programming (SDP) relaxation. The P–W SDP has been shown in the literature to have good statistical properties in a variety of settings, but remains intractable to solve in practice. To this end we propose FORCE, a new algorithm to solve this SDP relaxation. Compared to off-the-shelf interior point solvers, our method reduces the computational complexity of solving the SDP from O~ (d7log ϵ- 1) to O~ (d6K- 2ϵ- 1) arithmetic operations for an ϵ-optimal solution. Our method combines a primal first-order method with a dual optimality certificate search, which when successful, allows for early termination of the primal method. We show for certain variable clustering problems that, with high probability, FORCE is guaranteed to find the optimal solution to the SDP relaxation and provide a certificate of exact optimality. As verified by our numerical experiments, this allows FORCE to solve the P–W SDP with dimensions in the hundreds in only tens of seconds. For a variation of the P–W SDP where K is not known a priori a slight modification of FORCE reduces the computational complexity of solving this problem as well: from O~ (d7log ϵ- 1) using a standard SDP solver to O~ (d4ϵ- 1).

Original languageEnglish (US)
Pages (from-to)137-173
Number of pages37
JournalMathematical Programming
Volume176
Issue number1-2
DOIs
StatePublished - Jul 1 2019

Fingerprint

Latent Variable Models
Semidefinite Programming
Graphical Models
Computational complexity
Semidefinite Programming Relaxation
Clustering
Certificate
Optimality
Computational Complexity
Optimal Solution
Early Termination
K-group
K-means
Interior Point
Experiments
Statistical property
Efficient Algorithms
Numerical Experiment
First-order

ASJC Scopus subject areas

  • Software
  • Mathematics(all)

Cite this

@article{a50ce51dba2c4e9289a265e3f452349d,
title = "Efficient, certifiably optimal clustering with applications to latent variable graphical models",
abstract = "Motivated by the task of clustering either d variables or d points into K groups, we investigate efficient algorithms to solve the Peng–Wei (P–W) K-means semi-definite programming (SDP) relaxation. The P–W SDP has been shown in the literature to have good statistical properties in a variety of settings, but remains intractable to solve in practice. To this end we propose FORCE, a new algorithm to solve this SDP relaxation. Compared to off-the-shelf interior point solvers, our method reduces the computational complexity of solving the SDP from O~ (d7log ϵ- 1) to O~ (d6K- 2ϵ- 1) arithmetic operations for an ϵ-optimal solution. Our method combines a primal first-order method with a dual optimality certificate search, which when successful, allows for early termination of the primal method. We show for certain variable clustering problems that, with high probability, FORCE is guaranteed to find the optimal solution to the SDP relaxation and provide a certificate of exact optimality. As verified by our numerical experiments, this allows FORCE to solve the P–W SDP with dimensions in the hundreds in only tens of seconds. For a variation of the P–W SDP where K is not known a priori a slight modification of FORCE reduces the computational complexity of solving this problem as well: from O~ (d7log ϵ- 1) using a standard SDP solver to O~ (d4ϵ- 1).",
author = "Carson Eisenach and Han Liu",
year = "2019",
month = "7",
day = "1",
doi = "10.1007/s10107-019-01375-2",
language = "English (US)",
volume = "176",
pages = "137--173",
journal = "Mathematical Programming",
issn = "0025-5610",
publisher = "Springer-Verlag GmbH and Co. KG",
number = "1-2",

}

Efficient, certifiably optimal clustering with applications to latent variable graphical models. / Eisenach, Carson; Liu, Han.

In: Mathematical Programming, Vol. 176, No. 1-2, 01.07.2019, p. 137-173.

Research output: Contribution to journalArticle

TY - JOUR

T1 - Efficient, certifiably optimal clustering with applications to latent variable graphical models

AU - Eisenach, Carson

AU - Liu, Han

PY - 2019/7/1

Y1 - 2019/7/1

N2 - Motivated by the task of clustering either d variables or d points into K groups, we investigate efficient algorithms to solve the Peng–Wei (P–W) K-means semi-definite programming (SDP) relaxation. The P–W SDP has been shown in the literature to have good statistical properties in a variety of settings, but remains intractable to solve in practice. To this end we propose FORCE, a new algorithm to solve this SDP relaxation. Compared to off-the-shelf interior point solvers, our method reduces the computational complexity of solving the SDP from O~ (d7log ϵ- 1) to O~ (d6K- 2ϵ- 1) arithmetic operations for an ϵ-optimal solution. Our method combines a primal first-order method with a dual optimality certificate search, which when successful, allows for early termination of the primal method. We show for certain variable clustering problems that, with high probability, FORCE is guaranteed to find the optimal solution to the SDP relaxation and provide a certificate of exact optimality. As verified by our numerical experiments, this allows FORCE to solve the P–W SDP with dimensions in the hundreds in only tens of seconds. For a variation of the P–W SDP where K is not known a priori a slight modification of FORCE reduces the computational complexity of solving this problem as well: from O~ (d7log ϵ- 1) using a standard SDP solver to O~ (d4ϵ- 1).

AB - Motivated by the task of clustering either d variables or d points into K groups, we investigate efficient algorithms to solve the Peng–Wei (P–W) K-means semi-definite programming (SDP) relaxation. The P–W SDP has been shown in the literature to have good statistical properties in a variety of settings, but remains intractable to solve in practice. To this end we propose FORCE, a new algorithm to solve this SDP relaxation. Compared to off-the-shelf interior point solvers, our method reduces the computational complexity of solving the SDP from O~ (d7log ϵ- 1) to O~ (d6K- 2ϵ- 1) arithmetic operations for an ϵ-optimal solution. Our method combines a primal first-order method with a dual optimality certificate search, which when successful, allows for early termination of the primal method. We show for certain variable clustering problems that, with high probability, FORCE is guaranteed to find the optimal solution to the SDP relaxation and provide a certificate of exact optimality. As verified by our numerical experiments, this allows FORCE to solve the P–W SDP with dimensions in the hundreds in only tens of seconds. For a variation of the P–W SDP where K is not known a priori a slight modification of FORCE reduces the computational complexity of solving this problem as well: from O~ (d7log ϵ- 1) using a standard SDP solver to O~ (d4ϵ- 1).

UR - http://www.scopus.com/inward/record.url?scp=85061924817&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85061924817&partnerID=8YFLogxK

U2 - 10.1007/s10107-019-01375-2

DO - 10.1007/s10107-019-01375-2

M3 - Article

VL - 176

SP - 137

EP - 173

JO - Mathematical Programming

JF - Mathematical Programming

SN - 0025-5610

IS - 1-2

ER -