Parallel Deep Convolutional Neural Network Training by Exploiting the Overlapping of Computation and Communication

Sunwoo Lee, Dipendra Jha, Ankit Agrawal, Alok Choudhary, Wei Keng Liao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

Training a Convolutional Neural Network (CNN) is a computationally intensive task whose parallelization has become critical to completing the training in an acceptable time. However, there are two obstacles to developing a scalable parallel CNN in a distributed-memory computing environment: the high degree of data dependency in the model parameters between every two adjacent mini-batches, and the large amount of data to be transferred across the communication channel. In this paper, we present a parallelization strategy that maximizes the overlap of inter-process communication with computation. The overlap is achieved by using a dedicated thread per compute node to initiate communication as soon as the gradients become available. Since the output data of the backpropagation stage is generated layer by layer, the communication for one layer's data can run concurrently with the computation of the other layers. To study the effectiveness of the overlapping and its impact on scalability, we evaluated various model architectures and hyperparameter settings. When training the VGG-A model on the ImageNet dataset, we achieve speedups of 62.97× and 77.97× on 128 compute nodes using mini-batch sizes of 256 and 512, respectively.
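The overlap described in the abstract can be sketched as follows. This is a minimal illustration (not the authors' code): a dedicated communication thread consumes per-layer gradients from a queue while the main thread continues backpropagation through the remaining layers. The "allreduce" here is simulated by averaging stand-in values; in a real implementation it would be an inter-process collective (e.g., an MPI allreduce) issued as soon as a layer's gradients are ready. The names `comm_thread`, `grad_queue`, and `NUM_LAYERS` are illustrative, not from the paper.

```python
import queue
import threading

NUM_LAYERS = 4
grad_queue = queue.Queue()   # gradients handed off to the communication thread
reduced = []                 # layers whose gradients have been "communicated"

def comm_thread():
    # Consume per-layer gradients and communicate them while the main
    # thread keeps computing gradients for the remaining layers.
    while True:
        layer, grads = grad_queue.get()
        if layer is None:    # sentinel: backpropagation for this mini-batch is done
            break
        # Simulated allreduce: average the gradient values across "workers".
        reduced.append((layer, sum(grads) / len(grads)))

t = threading.Thread(target=comm_thread)
t.start()

# Backpropagation runs from the last layer to the first; each layer's
# gradients are enqueued immediately, overlapping with later computation.
for layer in reversed(range(NUM_LAYERS)):
    grads = [float(layer)] * 4   # stand-in for the computed gradient tensor
    grad_queue.put((layer, grads))

grad_queue.put((None, None))     # signal the end of the mini-batch
t.join()
print([layer for layer, _ in reduced])
```

Because the queue preserves FIFO order and gradients are produced from the last layer backward, the communication thread processes layers 3, 2, 1, 0 in that order, concurrently with the main thread's remaining layer computations.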

Original language: English (US)
Title of host publication: Proceedings - 24th IEEE International Conference on High Performance Computing, HiPC 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 183-192
Number of pages: 10
ISBN (Electronic): 9781538622933
DOI: https://doi.org/10.1109/HiPC.2017.00030
State: Published - Feb 7 2018
Event: 24th IEEE International Conference on High Performance Computing, HiPC 2017 - Jaipur, India
Duration: Dec 18 2017 to Dec 21 2017

Publication series

Name: Proceedings - 24th IEEE International Conference on High Performance Computing, HiPC 2017
Volume: 2017-December

Other

Other: 24th IEEE International Conference on High Performance Computing, HiPC 2017
Country: India
City: Jaipur
Period: 12/18/17 to 12/21/17

Keywords

  • Communication
  • Convolutional Neural Network
  • Deep Learning
  • Overlapping
  • Parallelization

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Hardware and Architecture
  • Modeling and Simulation

Cite this

Lee, S., Jha, D., Agrawal, A., Choudhary, A., & Liao, W. K. (2018). Parallel Deep Convolutional Neural Network Training by Exploiting the Overlapping of Computation and Communication. In Proceedings - 24th IEEE International Conference on High Performance Computing, HiPC 2017 (pp. 183-192). (Proceedings - 24th IEEE International Conference on High Performance Computing, HiPC 2017; Vol. 2017-December). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/HiPC.2017.00030