Direct Estimation of Weights and Efficient Training of Deep Neural Networks without SGD

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We argue that learning a hierarchy of features in a hierarchical dataset requires lower layers to approach convergence faster than the layers above them. We show that, if this assumption holds, we can analytically approximate the outcome of stochastic gradient descent (SGD) for each layer. We find that the weights should converge to a class-based PCA, with some weights in every layer dedicated to the principal components of each label class. The class-based PCA allows us to train layers directly, without SGD, often leading to a dramatic decrease in training complexity. We demonstrate the effectiveness of this approach by using our results to replace one or two convolutional layers in networks trained on the MNIST, CIFAR10 and CIFAR100 datasets, showing that our method achieves performance superior or comparable to that of similar architectures trained using SGD.
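The abstract does not spell out how the class-based PCA is computed, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes flattened inputs (or patches), centres each class separately, and stacks a fixed number of principal directions per class into one weight matrix. The function name `class_based_pca_weights` and the parameter `components_per_class` are hypothetical, and the data are synthetic.

```python
import numpy as np


def class_based_pca_weights(X, y, components_per_class):
    """Stack the leading principal directions of each label class.

    X: (n_samples, n_features) flattened inputs or patches.
    y: (n_samples,) integer class labels.
    Returns an array of shape (n_classes * components_per_class, n_features).
    """
    rows = []
    for label in np.unique(y):
        Xc = X[y == label]
        # Centre within the class (an assumption; the abstract does not say).
        Xc = Xc - Xc.mean(axis=0)
        # Rows of vt are orthonormal principal directions,
        # ordered by decreasing singular value.
        _, _, vt = np.linalg.svd(Xc, full_matrices=False)
        rows.append(vt[:components_per_class])
    return np.vstack(rows)


# Synthetic stand-in data: 200 samples, 16 features, 4 balanced classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))
y = np.arange(200) % 4

W = class_based_pca_weights(X, y, components_per_class=3)
features = X @ W.T  # linear layer output using the directly estimated weights
```

Here `W` plays the role of a layer's weight matrix obtained without any gradient steps. For a convolutional layer, `X` would presumably hold image patches and each row of `W` would be reshaped into a filter, but that mapping is an assumption of this sketch.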

Original language: English (US)
Title of host publication: 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3232-3236
Number of pages: 5
ISBN (Electronic): 9781479981311
DOI: https://doi.org/10.1109/ICASSP.2019.8682781
State: Published - May 1 2019
Event: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Brighton, United Kingdom
Duration: May 12 2019 to May 17 2019

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2019-May
ISSN (Print): 1520-6149

Conference

Conference: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Country: United Kingdom
City: Brighton
Period: 5/12/19 to 5/17/19

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

  • Cite this

    Dehmamy, N., Rohani, N., & Katsaggelos, A. K. (2019). Direct Estimation of Weights and Efficient Training of Deep Neural Networks without SGD. In 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings (pp. 3232-3236). [8682781] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2019-May). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2019.8682781