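### Abstract

We provide a counterexample showing that the K-means clustering algorithm using hard assignments produces biased and inconsistent estimates of the cluster means and variances. We discuss how a Gaussian mixture model that assumes spherical clusters with equal shape and size, and makes soft assignments to clusters produces consistent estimates from good starting values, and has computational complexity comparable to K-means. We recommend that the Gaussian mixture model be used instead of K-means.
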
Original language | English (US) |
---|---|
State | Published - 2016 |

### Cite this

**On the bias and inconsistency of K-means clustering.** / Jin, Chen; Malthouse, Edward Carl.

Research output: Book/Report › Other report

TY - BOOK

T1 - On the bias and inconsistency of K-means clustering

AU - Jin, Chen

AU - Malthouse, Edward Carl

PY - 2016

Y1 - 2016

N2 - We provide a counterexample showing that the K-means clustering algorithm using hard assignments produces biased and inconsistent estimates of the cluster means and variances. We discuss how a Gaussian mixture model that assumes spherical clusters with equal shape and size, and makes soft assignments to clusters produces consistent estimates from good starting values, and has computational complexity comparable to K-means. We recommend that the Gaussian mixture model be used instead of K-means.

M3 - Other report

BT - On the bias and inconsistency of K-means clustering

ER -
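
The abstract's claim about hard versus soft assignments can be illustrated with a small simulation. The sketch below is not the report's counterexample. It assumes a one-dimensional mixture of two equally weighted Gaussians with means -1 and +1 and unit variance, and it reads "equal shape and size" as equal mixing weights plus a single shared variance. The helper names `kmeans_1d` and `em_spherical_1d` are hypothetical and written for this illustration only.

```python
import numpy as np

# Assumed setup (not from the report): equal-weight mixture of
# N(-1, 1) and N(+1, 1), so the true means are -1 and +1 and the
# true within-cluster variance is 1.
rng = np.random.default_rng(0)
n = 100_000
true_means = np.array([-1.0, 1.0])
component = rng.integers(0, 2, size=n)
x = rng.normal(loc=true_means[component], scale=1.0)


def kmeans_1d(x, centers, n_iter=30):
    """Lloyd's algorithm in 1-D: hard assignment to the nearest center."""
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        labels = np.abs(x[:, None] - centers).argmin(axis=1)
        centers = np.array([x[labels == k].mean() for k in range(centers.size)])
    labels = np.abs(x[:, None] - centers).argmin(axis=1)
    # Pooled within-cluster variance under the hard assignment.
    within_var = np.mean((x - centers[labels]) ** 2)
    return centers, within_var


def em_spherical_1d(x, means, var=1.0, n_iter=100):
    """EM for a Gaussian mixture with equal weights and one shared variance."""
    means = np.array(means, dtype=float)
    for _ in range(n_iter):
        # E-step: soft assignments (responsibilities). Equal weights and the
        # shared variance make the normalizing constants cancel.
        log_p = -((x[:, None] - means) ** 2) / (2.0 * var)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted means and shared variance.
        means = (resp * x[:, None]).sum(axis=0) / resp.sum(axis=0)
        var = (resp * (x[:, None] - means) ** 2).sum() / x.size
    return means, var


start = [-0.5, 0.5]  # same starting values for both methods
km_centers, km_var = kmeans_1d(x, start)
em_means, em_var = em_spherical_1d(x, start)

print("K-means centers:", km_centers, "within-cluster variance:", km_var)
print("EM means:       ", em_means, "shared variance:        ", em_var)
```

Under these assumptions the hard split truncates each component at the boundary between clusters, so the K-means centers should land farther apart than the true means (near the folded-normal mean, roughly ±1.17) and the within-cluster variance should fall below 1. The soft-assignment estimates should stay close to ±1 and 1. Collecting more data does not remove the K-means gap, which is the sense in which the estimates are inconsistent.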