Query Caching and Optimization in Distributed Mediator Systems

S. Adali, K. S. Candan, Y. Papakonstantinou, V. S. Subrahmanian

Research output: Contribution to journal › Article › peer-review

227 Scopus citations

Abstract

Query processing and optimization in mediator systems that access distributed non-proprietary sources pose many novel problems. Cost-based query optimization is hard because the mediator does not have access to source statistics information and furthermore it may not be easy to model the source's performance. At the same time, querying remote sources may be very expensive because of high connection overhead, long computation time, financial charges, and temporary unavailability. We propose a cost-based optimization technique that caches statistics of actual calls to the sources and consequently estimates the cost of the possible execution plans based on the statistics cache. We investigate issues pertaining to the design of the statistics cache and experimentally analyze various tradeoffs. We also present a query result caching mechanism that allows us to effectively use results of prior queries when the source is not readily available. We employ the novel invariants mechanism, which shows how semantic information about data sources may be used to discover cached query results of interest.
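
As an illustration of the statistics-cache idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that each call to a source can be identified by a (source, call) pair, that the observed cost is a single number such as elapsed seconds, and that a plan's cost is the sum of the average observed costs of its calls. The names `StatisticsCache`, `plan_cost`, `cheapest_plan`, and the `default_cost` fallback are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


class StatisticsCache:
    """Hypothetical cache of observed costs for calls to remote sources."""

    def __init__(self, default_cost=1.0):
        # Assumed fallback estimate for calls that have never been observed.
        self.default_cost = default_cost
        self._observations = defaultdict(list)

    def record(self, source, call, cost):
        """Store the measured cost of one actual call to a source."""
        self._observations[(source, call)].append(cost)

    def estimate(self, source, call):
        """Estimate a call's cost as the mean of its past observations."""
        samples = self._observations.get((source, call))
        return mean(samples) if samples else self.default_cost


def plan_cost(cache, plan):
    """Assumed cost model: a plan costs the sum of its calls' estimates."""
    return sum(cache.estimate(source, call) for source, call in plan)


def cheapest_plan(cache, plans):
    """Pick the candidate plan with the lowest estimated cost."""
    return min(plans, key=lambda plan: plan_cost(cache, plan))
```

In this sketch a mediator would call `record` after every real source call and `cheapest_plan` when choosing among candidate execution plans. Averaging is only one possible summary; the abstract notes that the design of the statistics cache involves tradeoffs that the paper analyzes experimentally.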

Original language: English (US)
Pages (from-to): 137-148
Number of pages: 12
Journal: SIGMOD Record
Volume: 25
Issue number: 2
State: Published - Jun 1996
Externally published: Yes

ASJC Scopus subject areas

  • Software
  • Information Systems
