### Abstract

Clustering is a common problem in the analysis of large data sets. Streaming algorithms, which make a single pass over the data set using small working memory and produce a clustering comparable in cost to the optimal offline solution, are especially useful. We develop the first streaming algorithms achieving a constant-factor approximation to the cluster radius for two variations of the k-center clustering problem. We give a streaming (4 + ε)-approximation algorithm using O(ε^{-1} kz) memory for the problem with outliers, in which the clustering is allowed to drop up to z of the input points; previous work used a random sampling approach which yields only a bicriteria approximation. We also give a streaming (6 + ε)-approximation algorithm using O(ε^{-1} ln (ε^{-1}) k + k ^{2}) memory for a variation motivated by anonymity considerations in which each cluster must contain at least a certain number of input points.

Original language | English (US) |
---|---|

Title of host publication | Approximation, Randomization and Combinatorial Optimization |

Subtitle of host publication | Algorithms and Techniques - 11th International Workshop, APPROX 2008 and 12th International Workshop, RANDOM 2008, Proceedings |

Pages | 165-178 |

Number of pages | 14 |

DOIs | |

State | Published - Sep 22 2008 |

Event | 11th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2008 and 12th International Workshop on Randomization and Computation, RANDOM 2008 - Boston, MA, United States Duration: Aug 25 2008 → Aug 27 2008 |

### Publication series

Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|

Volume | 5171 LNCS |

ISSN (Print) | 0302-9743 |

ISSN (Electronic) | 1611-3349 |

### Conference

Conference | 11th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2008 and 12th International Workshop on Randomization and Computation, RANDOM 2008 |
---|---|

Country | United States |

City | Boston, MA |

Period | 8/25/08 → 8/27/08 |

### Keywords

- Anonymity
- Clustering
- Outliers
- Streaming
- k-center

### ASJC Scopus subject areas

- Theoretical Computer Science
- Computer Science(all)

## Fingerprint Dive into the research topics of 'Streaming algorithms for k-center clustering with outliers and with anonymity'. Together they form a unique fingerprint.

## Cite this

*Approximation, Randomization and Combinatorial Optimization: Algorithms and Techniques - 11th International Workshop, APPROX 2008 and 12th International Workshop, RANDOM 2008, Proceedings*(pp. 165-178). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5171 LNCS). https://doi.org/10.1007/978-3-540-85363-3_14