Message assembly and disassembly represent a significant fraction of total communication time in many parallel systems. We introduce a run-time approach for fast message assembly and disassembly. The approach is based on generating addresses by decoding a precomputed and compactly stored address relation that describes the mapping of addresses on the source node to addresses on the destination node. The main result is that relations induced by redistributions of regular block-cyclic distributed arrays can be encoded in an extremely compact form that facilitates high throughput message assembly and disassembly. We measure the throughput of decoding-based message assembly and disassembly on several systems and find performance on par with copy throughput.
ASJC Scopus subject areas
- Hardware and Architecture
- Computer Networks and Communications