Minimizing communication cost in distributed multi-query processing

Jian Li*, Amol Deshpande, Samir Khuller

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contribution

25 Scopus citations

Abstract

Increasing prevalence of large-scale distributed monitoring and computing environments such as sensor networks, scientific federations, Grids etc., has led to a renewed interest in the area of distributed query processing and optimization. In this paper we address a general, distributed multiquery processing problem motivated by the need to minimize the communication cost in these environments. Specifically we address the problem of optimally sharing data movement across the communication edges in a distributed communication network given a set of overlapping queries and query plans for them (specifying the operations to be executed). Most of the problem variations of our general problem can be shown to be NP-Hard by a reduction from the Steiner tree problem. However, we show that the problem can be solved optimally if the communication network is a tree, and present a novel algorithm for finding an optimal data movement plan. For general communication networks, we present efficient approximation algorithms for several variations of the problem. Finally, we present an experimental study over synthetic datasets showing both the need for exploiting the sharing of data movement and the effectiveness of our algorithms at finding such plans.

Original languageEnglish (US)
Title of host publicationProceedings - 25th IEEE International Conference on Data Engineering, ICDE 2009
Pages772-783
Number of pages12
DOIs
StatePublished - Jul 8 2009
Event25th IEEE International Conference on Data Engineering, ICDE 2009 - Shanghai, China
Duration: Mar 29 2009Apr 2 2009

Publication series

NameProceedings - International Conference on Data Engineering
ISSN (Print)1084-4627

Conference

Conference25th IEEE International Conference on Data Engineering, ICDE 2009
CountryChina
CityShanghai
Period3/29/094/2/09

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems

Fingerprint Dive into the research topics of 'Minimizing communication cost in distributed multi-query processing'. Together they form a unique fingerprint.

  • Cite this

    Li, J., Deshpande, A., & Khuller, S. (2009). Minimizing communication cost in distributed multi-query processing. In Proceedings - 25th IEEE International Conference on Data Engineering, ICDE 2009 (pp. 772-783). [4812453] (Proceedings - International Conference on Data Engineering). https://doi.org/10.1109/ICDE.2009.85