PXML: A probabilistic semistructured data model and algebra

Edward Hung*, Lise Getoor, V. S. Subrahmanian

*Corresponding author for this work

Research output: Contribution to conference › Paper › peer-review

104 Scopus citations

Abstract

Despite the recent proliferation of work on semistructured data models, there has been little work to date on supporting uncertainty in these models. In this paper, we propose a model for probabilistic semistructured data (PSD). The advantage of our approach is that it supports a flexible representation that allows the specification of a wide class of distributions over semistructured instances. We provide two semantics for the model and show that the semantics are probabilistically coherent. Next, we develop an extension of the relational algebra to handle probabilistic semistructured data and describe efficient algorithms for answering queries that use this algebra. Finally, we present experimental results showing the efficiency of our algorithms.

Original language: English (US)
Pages: 467-478
Number of pages: 12
State: Published - 2003
Externally published: Yes
Event: Nineteenth International Conference on Data Engineering - Bangalore, India
Duration: Mar 5 2003 → Mar 8 2003

Conference

Conference: Nineteenth International Conference on Data Engineering
Country/Territory: India
City: Bangalore
Period: 3/5/03 → 3/8/03

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems
