h5bench: A unified benchmark suite for evaluating HDF5 I/O performance on pre-exascale platforms

Jean Luca Bez*, Houjun Tang, Scot Breitenfeld, Huihuo Zheng, Wei Keng Liao, Kaiyuan Hou, Zanhua Huang, Suren Byna

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Parallel I/O is a critical technique for moving data between compute and storage subsystems of supercomputers. With massive amounts of data produced or consumed by compute nodes, high-performance parallel I/O is essential. I/O benchmarks play an important role in this process; however, there is a scarcity of I/O benchmarks representative of current workloads on HPC systems. Toward creating representative I/O kernels from real-world applications, we have created h5bench, a set of I/O kernels that exercise hierarchical data format version 5 (HDF5) I/O on parallel file systems in numerous dimensions. Our focus on HDF5 is due to the parallel I/O library's heavy usage in various scientific applications running on supercomputing systems. The tests benchmarked in the h5bench suite cover I/O operations (read and write), data locality (arrays of basic data types and arrays of structures), array dimensionality (one-dimensional arrays, two-dimensional meshes, three-dimensional cubes), and I/O modes (synchronous and asynchronous). In this paper, we present the observed performance of h5bench executed along several of these dimensions on existing supercomputers (Cori and Summit) and pre-exascale platforms (Perlmutter, Theta, and Polaris). h5bench measurements can be used to identify performance bottlenecks and their root causes and to evaluate I/O optimizations. As the I/O patterns of h5bench are diverse and capture the I/O behaviors of various HPC applications, this study will be helpful to the broader supercomputing and I/O community.

Original language: English (US)
Article number: e8046
Journal: Concurrency Computation Practice and Experience
Volume: 36
Issue number: 16
State: Accepted/In press - 2024

Keywords

  • HDF5
  • I/O access patterns
  • I/O benchmarks
  • I/O performance

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Computer Science Applications
  • Computer Networks and Communications
  • Computational Theory and Mathematics
