TY - GEN
T1 - Detailed analysis of I/O traces for large scale applications
AU - Nakka, N.
AU - Choudhary, A.
AU - Liao, W. K.
AU - Ward, L.
AU - Klundt, R.
AU - Weston, M. I.
N1 - Copyright 2010 Elsevier B.V., All rights reserved.
PY - 2009
Y1 - 2009
AB - In this paper, we present a tool to extract I/O traces from very large applications running at full scale during their production runs. We analyze the traces of three applications to gain information about them. The analysis shows that the I/O traces reveal much about an application even without access to its source code. In particular, the changes in data volume and in I/O request distribution at the checkpoints give multiple indications of the algorithmic nature of the application. Adaptive Mesh Refinement (AMR) is one kind of algorithm that can exhibit such I/O behavior. This is the first study of the I/O characteristics of unbalanced AMR-supported applications at scale. The key observations from the traces were (1) variation in aggregate data sizes across checkpoints for AMR and non-AMR applications, (2) variation in the number of I/O calls by a client depending on the nature of the application, (3) use of temporary files by applications and possible erroneous calls to I/O functions, (4) variation in average data transfer size depending on whether the application has AMR support, (5) aggregation of I/O for processes executing on a single physical node through MPI-IO calls, and (6) updates to specific data structures in the checkpoint file.
KW - Adaptive mesh refinement
KW - I/O trace analysis
KW - Large scale I/O tracing
UR - http://www.scopus.com/inward/record.url?scp=77952212717&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77952212717&partnerID=8YFLogxK
U2 - 10.1109/HIPC.2009.5433186
DO - 10.1109/HIPC.2009.5433186
M3 - Conference contribution
AN - SCOPUS:77952212717
SN - 9781424449224
T3 - 16th International Conference on High Performance Computing, HiPC 2009 - Proceedings
SP - 419
EP - 427
BT - 16th International Conference on High Performance Computing, HiPC 2009 - Proceedings
T2 - 16th International Conference on High Performance Computing, HiPC 2009
Y2 - 16 December 2009 through 19 December 2009
ER -