Scalable Algorithms for MPI Intergroup Allgather and Allgatherv

Qiao Kang*, Jesper Larsson Träff, Reda Al-Bahrani, Ankit Agrawal, Alok Choudhary, Wei-keng Liao

*Corresponding author for this work

Research output: Contribution to journal › Article

Abstract

MPI intergroup collective communication defines message transfer patterns between two disjoint groups of MPI processes. Such patterns occur in coupled applications and in modern scientific application workflows, often with large data sizes. However, current implementations in MPI production libraries adopt the "root gathering algorithm", which does not achieve optimal communication transfer time. In this paper, we propose algorithms for the intergroup Allgather and Allgatherv communication operations under single-port communication constraints. We implement the new algorithms using MPI point-to-point and standard intra-communicator collective communication functions, and we evaluate their performance on the Cori supercomputer at NERSC. For message sizes per compute node ranging from 64 KBytes to 8 MBytes, our experiments show performance improvements of up to 23.67 times on 256 compute nodes compared with the implementations of production MPI libraries.
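For context, the "root gathering algorithm" referenced above is the baseline found in production MPI libraries for intergroup Allgather: each group gathers its contributions at a local root, the two roots exchange the gathered buffers across the intercommunicator, and each root then broadcasts the remote group's data within its own group. The following C sketch illustrates that baseline only, not the paper's proposed algorithms; the function name intergroup_allgather_root is hypothetical, and we assume the caller provides both the intercommunicator and a communicator spanning the local group.

    /* Minimal sketch of the baseline "root gathering" intergroup
     * Allgather: local gather -> root-to-root exchange -> local bcast.
     * Assumes recvbuf can hold the remote group's full contribution. */
    #include <mpi.h>
    #include <stdlib.h>

    void intergroup_allgather_root(const void *sendbuf, int count,
                                   MPI_Datatype dtype, void *recvbuf,
                                   MPI_Comm localcomm, MPI_Comm intercomm)
    {
        int lrank, lsize, rsize;
        MPI_Comm_rank(localcomm, &lrank);
        MPI_Comm_size(localcomm, &lsize);
        MPI_Comm_remote_size(intercomm, &rsize);

        MPI_Aint lb, extent;
        MPI_Type_get_extent(dtype, &lb, &extent);

        /* Step 1: gather the local group's data at local root 0. */
        void *gathered = NULL;
        if (lrank == 0)
            gathered = malloc((size_t)lsize * count * extent);
        MPI_Gather(sendbuf, count, dtype, gathered, count, dtype, 0, localcomm);

        /* Step 2: the two roots swap gathered buffers. On an
         * intercommunicator, rank 0 addresses the remote group's root. */
        if (lrank == 0)
            MPI_Sendrecv(gathered, lsize * count, dtype, 0, 0,
                         recvbuf, rsize * count, dtype, 0, 0,
                         intercomm, MPI_STATUS_IGNORE);

        /* Step 3: distribute the remote group's data locally. */
        MPI_Bcast(recvbuf, rsize * count, dtype, 0, localcomm);

        if (lrank == 0)
            free(gathered);
    }

The roots serialize all traffic in steps 1-3, which is why this baseline falls short of the optimal transfer time that the paper's single-port algorithms target.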

Original language: English (US)
Pages (from-to): 220-230
Number of pages: 11
Journal: Parallel Computing
Volume: 85
State: Published - Jul 2019

Keywords

  • All-to-all broadcast
  • Allgather
  • Allgatherv
  • Intergroup collective communication

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computer Networks and Communications
  • Computer Graphics and Computer-Aided Design
  • Artificial Intelligence
