Aggregate operators in probabilistic databases

Robert Ross*, V. S. Subrahmanian, John Grant

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

68 Scopus citations

Abstract

Though extensions to the relational data model have been proposed in order to handle probabilistic information, there has been very little work to date on handling aggregate operators in such databases. In this article, we present a very general notion of an aggregate operator and show how classical aggregation operators (such as COUNT, SUM, etc.) as well as statistical operators (such as percentiles, variance, etc.) are special cases of this general definition. We devise a formal linear programming based semantics for computing aggregates over probabilistic DBMSs, develop algorithms that satisfy this semantics, analyze their complexity, and introduce several families of approximation algorithms that run in polynomial time. We implemented all of these algorithms and tested them on a large set of data to help determine when each one is preferable.

Original languageEnglish (US)
Pages (from-to)54-101
Number of pages48
JournalJournal of the ACM
Volume52
Issue number1
DOIs
StatePublished - 2005
Externally publishedYes

Keywords

  • Aggregates
  • Probabilistic relational databases

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Information Systems
  • Hardware and Architecture
  • Artificial Intelligence

Fingerprint

Dive into the research topics of 'Aggregate operators in probabilistic databases'. Together they form a unique fingerprint.

Cite this