ProbView: A Flexible Probabilistic Database System

Laks V.S. Lakshmanan*, Nicola Leone, Robert Ross, V. S. Subrahmanian

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

236 Scopus citations

Abstract

Probability theory is mathematically the best understood paradigm for modeling and manipulating uncertain information. Probabilities of complex events can be computed from those of basic events on which they depend, using any of a number of strategies. Which strategy is appropriate depends very much on the known interdependencies among the events involved. Previous work on probabilistic databases has assumed a fixed and restrictive combination strategy (e.g., assuming all events are pairwise independent). In this article, we characterize, using postulates, whole classes of strategies for conjunction, disjunction, and negation, meaningful from the viewpoint of probability theory. (1) We propose a probabilistic relational data model and a generic probabilistic relational algebra that neatly captures various strategies satisfying the postulates, within a single unified framework. (2) We show that as long as the chosen strategies can be computed in polynomial time, queries in the positive fragment of the probabilistic relational algebra have essentially the same data complexity as classical relational algebra. (3) We establish various containments and equivalences between algebraic expressions, similar in spirit to those in classical algebra. (4) We develop algorithms for maintaining materialized probabilistic views. (5) Based on these ideas, we have developed a prototype probabilistic database system called ProbView on top of Dbase V.0. We validate our complexity results with experiments and show that rewriting certain types of queries to other equivalent forms often yields substantial savings.
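
The abstract's observation that the appropriate combination strategy depends on the known interdependencies among events can be illustrated with a short sketch. The function names and the choice to return a (lower, upper) pair are assumptions made for illustration; they are not the ProbView interface or the article's postulates. The formulas themselves are standard probability results: the product rule for independent events and the Fréchet bounds when nothing is known about the dependency.

```python
from typing import Tuple

# Hypothetical representation: a (lower, upper) bound on a probability.
Interval = Tuple[float, float]


def conj_independence(p: float, q: float) -> Interval:
    """P(A and B) when A and B are independent: the product rule."""
    v = p * q
    return (v, v)


def conj_ignorance(p: float, q: float) -> Interval:
    """P(A and B) when nothing is known about the dependency (Frechet bounds)."""
    return (max(0.0, p + q - 1.0), min(p, q))


def conj_positive_correlation(p: float, q: float) -> Interval:
    """P(A and B) when one event implies the other: the smaller probability."""
    v = min(p, q)
    return (v, v)


def disj_independence(p: float, q: float) -> Interval:
    """P(A or B) when A and B are independent."""
    v = p + q - p * q
    return (v, v)


def disj_ignorance(p: float, q: float) -> Interval:
    """P(A or B) when nothing is known about the dependency (Frechet bounds)."""
    return (max(p, q), min(1.0, p + q))


def disj_positive_correlation(p: float, q: float) -> Interval:
    """P(A or B) when one event implies the other: the larger probability."""
    v = max(p, q)
    return (v, v)


if __name__ == "__main__":
    p, q = 0.5, 0.5
    print(conj_independence(p, q))          # (0.25, 0.25)
    print(conj_ignorance(p, q))             # (0.0, 0.5)
    print(conj_positive_correlation(p, q))  # (0.5, 0.5)
    print(disj_independence(p, q))          # (0.75, 0.75)
    print(disj_ignorance(p, q))             # (0.5, 1.0)
```

With the same two input probabilities, the three conjunction strategies give different answers, which is why fixing a single strategy such as independence can be restrictive.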

Original language: English (US)
Pages (from-to): 419-469
Number of pages: 51
Journal: ACM Transactions on Database Systems
Volume: 22
Issue number: 3
State: Published - Sep 1997
Externally published: Yes

Keywords

  • Algebra
  • Data complexity
  • H.2.1 [Database Management]: Logical Design-data models
  • H.2.3 [Database Management]: Languages-query languages
  • H.2.4 [Database Management]: Systems
  • Performance evaluation
  • Probabilistic databases
  • View maintenance

ASJC Scopus subject areas

  • Information Systems
