TY - GEN
T1 - Asynchronous I/O Strategy for Large-Scale Deep Learning Applications
AU - Lee, Sunwoo
AU - Kang, Qiao
AU - Wang, Kewei
AU - Balewski, Jan
AU - Sim, Alex
AU - Agrawal, Ankit
AU - Choudhary, Alok
AU - Nugent, Peter
AU - Wu, Kesheng
AU - Liao, Wei-Keng
N1 - Funding Information:
This material is based upon work supported in part by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (SciDAC) program award numbers DE-SC0021399 and DE-SC0019358. This work was supported by the Office of Advanced Scientific Computing Research, Office of Science, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231, and also used resources of the National Energy Research Scientific Computing Center. This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.
Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Many scientific applications have started using deep learning methods for their classification or regression problems. However, for data-intensive scientific applications, I/O performance can be the major performance bottleneck. To effectively solve important real-world problems using deep learning methods on High-Performance Computing (HPC) systems, it is essential to address the poor I/O performance of large-scale neural network training. In this paper, we propose an asynchronous I/O strategy that can be generally applied to deep learning applications. Our I/O strategy employs an I/O-dedicated thread per process that performs I/O operations independently of the training progress. The I/O thread reads many training samples at once to reduce the total number of I/O operations per epoch. Given a fixed amount of training data, the fewer the I/O operations per epoch, the shorter the overall I/O time. The I/O operations are also overlapped with the computations using the double-buffering method. We evaluate our I/O strategy using two real-world scientific applications, CosmoFlow and Neuron-Inverter. Our experimental results demonstrate that the proposed I/O strategy significantly improves the scaling performance without affecting the regression performance.
AB - Many scientific applications have started using deep learning methods for their classification or regression problems. However, for data-intensive scientific applications, I/O performance can be the major performance bottleneck. To effectively solve important real-world problems using deep learning methods on High-Performance Computing (HPC) systems, it is essential to address the poor I/O performance of large-scale neural network training. In this paper, we propose an asynchronous I/O strategy that can be generally applied to deep learning applications. Our I/O strategy employs an I/O-dedicated thread per process that performs I/O operations independently of the training progress. The I/O thread reads many training samples at once to reduce the total number of I/O operations per epoch. Given a fixed amount of training data, the fewer the I/O operations per epoch, the shorter the overall I/O time. The I/O operations are also overlapped with the computations using the double-buffering method. We evaluate our I/O strategy using two real-world scientific applications, CosmoFlow and Neuron-Inverter. Our experimental results demonstrate that the proposed I/O strategy significantly improves the scaling performance without affecting the regression performance.
KW - Deep Learning
KW - I/O
KW - Parallelization
UR - http://www.scopus.com/inward/record.url?scp=85125637530&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85125637530&partnerID=8YFLogxK
U2 - 10.1109/HiPC53243.2021.00046
DO - 10.1109/HiPC53243.2021.00046
M3 - Conference contribution
AN - SCOPUS:85125637530
T3 - Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021
SP - 322
EP - 331
BT - Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 28th IEEE International Conference on High Performance Computing, Data, and Analytics, HiPC 2021
Y2 - 17 December 2021 through 18 December 2021
ER -