TY - GEN
T1 - A scientific data management system for irregular applications
AU - No, Jaechun
AU - Thakur, R.
AU - Kaushik, D.
AU - Freitag, L.
AU - Choudhary, A.
N1 - Funding Information:
This work was supported in part by the Mathematical, Information, and Computational Sciences Division subprogram of the Office of Advanced Scientific Computing Research, U.S. Department of Energy, under Contract W-31-109-Eng-38, and in part by a Work-for-Others Subaward No. 751 with the University of Illinois, under NSF Cooperative Agreement #ACI-9619019.
Publisher Copyright:
© 2001 IEEE.
PY - 2001
Y1 - 2001
N2 - Many scientific applications are I/O intensive and generate large data sets, spanning hundreds or thousands of "files." Management, storage, efficient access, and analysis of this data present an extremely challenging task. We have developed a software system, called Scientific Data Manager (SDM), that uses a combination of parallel file I/O and database support for high-performance scientific data management. SDM provides a high-level API to the user and, internally, uses a parallel file system to store real data and a database to store application-related metadata. In this paper, we describe how we designed and implemented SDM to support irregular applications. SDM can efficiently handle the reading and writing of data in an irregular mesh, as well as the distribution of index values. We describe the SDM user interface and how we have implemented it to achieve high performance. SDM makes extensive use of MPI-IO's noncontiguous collective I/O functions. SDM also uses the concept of a history file to optimize the cost of the index distribution using the metadata stored in database. We present performance results with two irregular applications, a CFD code called FUN3D and a Rayleigh-Taylor instability code, on the SGI Origin2000 at Argonne National Laboratory.
AB - Many scientific applications are I/O intensive and generate large data sets, spanning hundreds or thousands of "files." Management, storage, efficient access, and analysis of this data present an extremely challenging task. We have developed a software system, called Scientific Data Manager (SDM), that uses a combination of parallel file I/O and database support for high-performance scientific data management. SDM provides a high-level API to the user and, internally, uses a parallel file system to store real data and a database to store application-related metadata. In this paper, we describe how we designed and implemented SDM to support irregular applications. SDM can efficiently handle the reading and writing of data in an irregular mesh, as well as the distribution of index values. We describe the SDM user interface and how we have implemented it to achieve high performance. SDM makes extensive use of MPI-IO's noncontiguous collective I/O functions. SDM also uses the concept of a history file to optimize the cost of the index distribution using the metadata stored in database. We present performance results with two irregular applications, a CFD code called FUN3D and a Rayleigh-Taylor instability code, on the SGI Origin2000 at Argonne National Laboratory.
UR - http://www.scopus.com/inward/record.url?scp=84981156270&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84981156270&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2001.925096
DO - 10.1109/IPDPS.2001.925096
M3 - Conference contribution
AN - SCOPUS:84981156270
T3 - Proceedings - 15th International Parallel and Distributed Processing Symposium, IPDPS 2001
SP - 1216
EP - 1223
BT - Proceedings - 15th International Parallel and Distributed Processing Symposium, IPDPS 2001
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 15th International Parallel and Distributed Processing Symposium, IPDPS 2001
Y2 - 23 April 2001 through 27 April 2001
ER -