Mapping communication layouts to network hardware characteristics on massive-scale Blue Gene systems

Pavan Balaji*, Rinku Gupta, Abhinav Vishnu, Pete Beckman

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

24 Scopus citations

Abstract

For parallel applications running on high-end computing systems, which processes of an application get launched on which processing cores is typically determined at application launch time without any information about the application characteristics. As high-end computing systems continue to grow in scale, however, this approach is becoming increasingly infeasible for achieving the best performance. For example, for systems such as IBM Blue Gene and Cray XT that rely on flat 3D torus networks, process communication often involves network sharing, even for highly scalable applications. This causes the overall application performance to depend heavily on how processes are mapped onto the network. In this paper, we first analyze the impact of different process mappings on application performance on a massive Blue Gene/P system. Then, we match this analysis with application communication patterns that we allow applications to describe prior to being launched. The underlying process management system can use this combined information in conjunction with the hardware characteristics of the system to determine the best mapping for the application. Our experiments study the performance of different communication patterns, including 2D and 3D nearest-neighbor communication and structured Cartesian grid communication. Our studies, which scale up to 131,072 cores of the largest BG/P system in the United States (using 80% of the total system size), demonstrate that different process mappings can show significant differences in overall performance, especially at scale. For example, we show that this difference can be as much as 30% for P3DFFT and up to twofold for HALO. Through our proposed model, however, such differences in performance can be avoided so that the best possible performance is always achieved.
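To make the dependence on process mapping concrete, the following is a minimal sketch, not taken from the paper, of how the order in which ranks are laid out on a 3D torus changes the distance between logical neighbors. It assumes a simplified model: one process per torus node, a periodic 2D nearest-neighbor pattern with row-major rank numbering, and hop count as a stand-in for network sharing. The function names (`rank_to_coord`, `torus_hops`, `avg_neighbor_hops`) and the axis-order strings are hypothetical and chosen only for illustration.

```python
from itertools import product


def rank_to_coord(rank, dims, order):
    """Place `rank` on a torus of size dims = (X, Y, Z).

    `order` lists the axes from fastest- to slowest-varying,
    e.g. "XYZ" fills the X dimension first (illustrative convention).
    """
    coord = {}
    for axis in order:
        size = dims["XYZ".index(axis)]
        coord[axis] = rank % size
        rank //= size
    return (coord["X"], coord["Y"], coord["Z"])


def torus_hops(a, b, dims):
    """Minimum number of hops between nodes a and b, using wraparound links."""
    return sum(min(abs(p - q), d - abs(p - q)) for p, q, d in zip(a, b, dims))


def avg_neighbor_hops(dims, order, rows, cols):
    """Average hop count from each process to its four neighbors
    in a periodic rows x cols logical grid (rank = row * cols + col)."""
    assert rows * cols == dims[0] * dims[1] * dims[2]
    total = 0
    count = 0
    for r, c in product(range(rows), range(cols)):
        me = rank_to_coord(r * cols + c, dims, order)
        for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            nbr_rank = ((r + dr) % rows) * cols + (c + dc) % cols
            nbr = rank_to_coord(nbr_rank, dims, order)
            total += torus_hops(me, nbr, dims)
            count += 1
    return total / count


if __name__ == "__main__":
    dims = (8, 8, 8)  # assumed torus size, for illustration only
    for order in ("XYZ", "ZYX", "YXZ"):
        hops = avg_neighbor_hops(dims, order, rows=16, cols=32)
        print(order, hops)
```

A lower average suggests that logical neighbors sit closer together on the torus and so contend for fewer shared links. A real mapping decision would also have to account for multiple cores per node, routing, and the application's actual message volumes, none of which this sketch models.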

Original language: English (US)
Pages (from-to): 247-256
Number of pages: 10
Journal: Computer Science - Research and Development
Volume: 26
Issue number: 3-4
State: Published - Jun 2011

Keywords

  • Blue Gene
  • Process mapping
  • Torus networks

ASJC Scopus subject areas

  • General Computer Science
