High performance multidimensional analysis of large datasets

Sanjay Goil, Alok Choudhary

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

17 Scopus citations

Abstract

Summary information from data in large databases is used to answer queries in On-Line Analytical Processing (OLAP) systems and to build decision support systems over them. The Data Cube is used to calculate and store summary information on a variety of dimensions, and it is computed only partially if the number of dimensions is large. Queries posed on such systems are quite complex and require different views of the data. These may either be answered from a materialized cube in the data cube or calculated on the fly. Further, data mining for associations can be performed on the data cube. Analytical models need to capture the multidimensionality of the underlying data, a task for which multidimensional databases are well suited. Multidimensional databases store data in a multidimensional structure on which analytical operations are performed. A challenge for these systems is how to handle large data sets in a large number of dimensions. This paper presents a parallel OLAP infrastructure for multidimensional databases integrated with association rule mining. Scheduling optimizations for parallel computation of complete data cubes are presented. We propose left and right schedules for partial data cubes for m-way mining of association rules. Our implementation on the IBM SP-2, a shared-nothing parallel machine, can handle large data sets and a large number of dimensions by using disk I/O in our algorithms.
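To make the data cube idea concrete, the following is a minimal sequential sketch in Python, not the paper's parallel algorithm or its schedules. It computes a complete cube (one aggregate table per subset of dimensions) and derives each group-by from the smallest already-computed parent with one more dimension, which is valid here because the aggregate is a sum. The function name `data_cube`, the sample rows, and the dimension names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def data_cube(rows, dims, measure):
    """Return {dimension subset: {key tuple: summed measure}} for all 2^d subsets.

    Illustrative only: sequential, in-memory, sum aggregate.
    """
    full = tuple(dims)

    # Base cuboid: group the raw rows by every dimension.
    base = defaultdict(int)
    for row in rows:
        base[tuple(row[d] for d in full)] += row[measure]
    cube = {full: dict(base)}

    # Work downward: each k-dimensional group-by is aggregated from a
    # (k+1)-dimensional parent instead of from the raw rows.
    for k in range(len(full) - 1, -1, -1):
        for subset in combinations(full, k):
            parents = [p for p in cube
                       if len(p) == k + 1 and set(subset) <= set(p)]
            # Assumption: prefer the parent with the fewest cells.
            parent = min(parents, key=lambda p: len(cube[p]))
            keep = [parent.index(d) for d in subset]
            agg = defaultdict(int)
            for key, value in cube[parent].items():
                agg[tuple(key[i] for i in keep)] += value
            cube[subset] = dict(agg)
    return cube


# Hypothetical sample data with three dimensions and one measure.
rows = [
    {"product": "pen", "store": "A", "year": 1997, "sales": 10},
    {"product": "pen", "store": "B", "year": 1998, "sales": 5},
    {"product": "ink", "store": "A", "year": 1998, "sales": 7},
]
cube = data_cube(rows, ("product", "store", "year"), "sales")
```

With three dimensions the cube holds eight group-bys, from the full `(product, store, year)` table down to the grand total stored under the empty tuple. A partial cube, as mentioned in the abstract, would materialize only some of these subsets.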

Original language: English (US)
Title of host publication: Proceedings of the 1st ACM International Workshop on Data Warehousing and OLAP, DOLAP 1998
Publisher: Association for Computing Machinery
Pages: 34-39
Number of pages: 6
ISBN (Electronic): 1581131208
State: Published - Nov 1 1998
Event: 1st ACM International Workshop on Data Warehousing and OLAP, DOLAP 1998 - Washington, United States
Duration: Nov 2 1998 → Nov 7 1998

Publication series

Name: DOLAP: Proceedings of the ACM International Workshop on Data Warehousing and OLAP
Volume: Part F129242

Other

Other: 1st ACM International Workshop on Data Warehousing and OLAP, DOLAP 1998
Country: United States
City: Washington
Period: 11/2/98 → 11/7/98

ASJC Scopus subject areas

  • Computer Science(all)
