Full-duplex inter-group all-to-all broadcast algorithms with optimal bandwidth

Qiao Kang, Ankit Agrawal, Jesper Larsson Träff, Alok Choudhary, Reda Al-Bahrani, Liao Weikeng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

MPI inter-group collective communication patterns can be viewed as bipartite graphs: processes are divided into two disjoint groups, and messages are transferred between the groups but not within them. Such communication patterns can serve as basic operations for scientific application workflows. In this paper, we present parallel algorithms for inter-group all-to-all broadcast (Allgather) communication that achieve optimal bandwidth for any message size and any number of processes under single-port communication constraints. We implement the algorithms using MPI point-to-point and intra-group collective communication functions and evaluate their performance on the Cori supercomputer at NERSC. With message sizes ranging from 256 B to 64 MB, the experiments show a significant performance improvement: our algorithms are up to 9.27 times faster than production MPI libraries that adopt the so-called root-gathering algorithm.
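
The inter-group Allgather pattern described in the abstract can be illustrated with a small simulation. The following is a minimal sketch in plain Python (no MPI), not the paper's algorithm: it models only the root-gathering baseline, assuming it proceeds in three phases (intra-group gather onto a root, exchange between the two roots, intra-group broadcast). The function names and the choice of rank 0 as root are hypothetical and exist only for illustration.

```python
def root_gathering_allgather(msgs_a, msgs_b):
    """Simulate an inter-group Allgather with a root-gathering scheme.

    msgs_a[i] is the message contributed by process i of group A,
    msgs_b[j] the message contributed by process j of group B.
    Returns what every process in A and in B holds afterwards.
    """
    # Phase 1 (assumed): each group gathers its messages onto its root,
    # taken here to be local rank 0.
    root_a_buf = list(msgs_a)
    root_b_buf = list(msgs_b)

    # Phase 2 (assumed): the two roots exchange their gathered buffers.
    root_a_buf, root_b_buf = root_b_buf, root_a_buf

    # Phase 3 (assumed): each root broadcasts the received buffer
    # to every process in its own group.
    out_a = [list(root_a_buf) for _ in msgs_a]
    out_b = [list(root_b_buf) for _ in msgs_b]
    return out_a, out_b


def min_bytes_received(remote_group_size, msg_bytes):
    """Data a single process must receive in total: one message of
    msg_bytes from every process in the remote group."""
    return remote_group_size * msg_bytes


if __name__ == "__main__":
    out_a, out_b = root_gathering_allgather(["a0", "a1", "a2"], ["b0", "b1"])
    print(out_a)  # every process in A holds all of B's messages
    print(out_b)  # every process in B holds all of A's messages
    print(min_bytes_received(2, 256))
```

In this sketch all inter-group traffic passes through the two roots in phase 2, which is the kind of serialization that a full-duplex, bandwidth-optimal scheme would aim to avoid.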

Original language: English (US)
Title of host publication: EuroMPI 2018 - Proceedings of the 25th European MPI Users' Group Meeting
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450364928
State: Published - Sep 23 2018
Event: 25th European MPI Users' Group Meeting, EuroMPI 2018 - Barcelona, Spain
Duration: Sep 23 2018 to Sep 26 2018

Publication series

Name: ACM International Conference Proceeding Series

Other

Other: 25th European MPI Users' Group Meeting, EuroMPI 2018
Country/Territory: Spain
City: Barcelona
Period: 9/23/18 to 9/26/18

Keywords

  • All-to-all broadcast
  • Allgather
  • Inter-group communication

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
