Delegation-based I/O mechanism for high performance computing systems

Arifa Nisar*, Wei Keng Liao, Alok Choudhary

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

15 Scopus citations

Abstract

Massively parallel applications often require periodic data checkpointing for program restart and post-run data analysis. Although high performance computing systems provide massive parallelism and computing power to fulfill the crucial requirements of scientific applications, the I/O tasks of high-end applications do not scale. Strict data consistency semantics adopted from traditional file systems are inadequate for homogeneous parallel computing platforms. For high performance parallel applications, independent I/O is critical, particularly if checkpointing data are dynamically created or irregularly partitioned. In particular, parallel programs generating a large number of unrelated I/O accesses on large-scale systems often face serious I/O serialization introduced by lock contention and conflicts at the file system layer. As these applications may not be able to utilize I/O optimizations that require process synchronization, they pose a great challenge for parallel I/O architecture and software designs. We propose an I/O mechanism to bridge the gap between scientific applications and parallel storage systems. A static file domain partitioning method is developed to align the I/O requests and produce a client-server mapping that minimizes file lock acquisition costs and eliminates lock contention. Our performance evaluations of production application I/O kernels demonstrate scalable performance and achieve high I/O bandwidths.
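As a rough illustration of what a static file domain partitioning could look like, the sketch below splits a byte-range request at fixed boundaries and assigns each piece to a delegate in round-robin order, so that no two delegates ever touch the same aligned block. This is one plausible reading of the abstract, not the authors' implementation: the function names, the `stripe_size` and `num_delegates` parameters, and the round-robin assignment rule are all assumptions made for the example.

```python
from typing import List, Tuple


def delegate_for_offset(offset: int, stripe_size: int, num_delegates: int) -> int:
    """Return the (hypothetical) delegate that owns the block containing `offset`.

    Assumption: blocks of `stripe_size` bytes are assigned round-robin.
    """
    return (offset // stripe_size) % num_delegates


def split_request(
    offset: int, length: int, stripe_size: int, num_delegates: int
) -> List[Tuple[int, int, int]]:
    """Split the byte range [offset, offset + length) at block boundaries.

    Returns a list of (delegate, piece_offset, piece_length) tuples. Because
    the mapping depends only on the offset, every client computes the same
    owner for a given block without any communication.
    """
    pieces = []
    end = offset + length
    while offset < end:
        # End of the aligned block that contains the current offset.
        block_end = (offset // stripe_size + 1) * stripe_size
        piece_end = min(end, block_end)
        owner = delegate_for_offset(offset, stripe_size, num_delegates)
        pieces.append((owner, offset, piece_end - offset))
        offset = piece_end
    return pieces


if __name__ == "__main__":
    # A 10-byte request starting at offset 3, with 4-byte blocks and 2 delegates.
    for owner, off, size in split_request(3, 10, stripe_size=4, num_delegates=2):
        print(f"delegate {owner}: offset={off} length={size}")
```

In this sketch each aligned block has exactly one owner, so if the block size matches the file system's lock granularity, only one process would ever request the lock for a given block. Whether the paper uses this exact assignment rule is not stated in the abstract.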

Original language: English (US)
Article number: 6087361
Pages (from-to): 271-279
Number of pages: 9
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 23
Issue number: 2
State: Published - 2012

Funding

This work was supported in part by NSF award numbers: CCF-0621443, SDCI OCI-0724599, CNS-0551639, IIS-0536994, and HECURA-0938000. This work was also partially supported by DOE grants DE-FC02-07ER25808, DE-FG02-08ER25848, DE-SC0005309, and DE-SC0005340. This research used resources of the National Energy Research Scientific Computing Center (NERSC), which is supported by the Office of Science of the US Department of Energy under Contract No. DE-AC02-05CH11231.

Keywords

  • I/O delegation
  • MPI-IO
  • Parallel I/O
  • collaborative caching
  • file locking
  • non collective I/O
  • parallel file systems

ASJC Scopus subject areas

  • Signal Processing
  • Hardware and Architecture
  • Computational Theory and Mathematics
