Neuron-based Pruning of Deep Neural Networks with Better Generalization using Kronecker Factored Curvature Approximation

Abdolghani Ebrahimi, Diego Klabjan

Research output: Chapter in Book/Report/Conference proceedingConference contribution

1 Scopus citations

Abstract

Existing methods of pruning deep neural networks focus on removing unnecessary parameters of the trained net-work and fine tuning the model afterwards to find a good solution that recovers the initial performance of the trained model. Unlike other works, our method pays special attention to the quality of the solution in the compressed model and inference computation time by pruning neurons. The proposed algorithm directs the parameters of the compressed model toward a flatter solution by exploring the spectral radius of Hessian which results in better generalization on unseen data. Moreover, the method does not work with a pre-trained network and performs training and pruning simultaneously. Our result shows that it improves the state-of-the-art results on neuron compression. The method is able to achieve very small networks with small accuracy degradation across different neural network models.

Original languageEnglish (US)
Title of host publicationIJCNN 2023 - International Joint Conference on Neural Networks, Proceedings
PublisherInstitute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)9781665488679
DOIs
StatePublished - 2023
Event2023 International Joint Conference on Neural Networks, IJCNN 2023 - Gold Coast, Australia
Duration: Jun 18 2023Jun 23 2023

Publication series

NameProceedings of the International Joint Conference on Neural Networks
Volume2023-June

Conference

Conference2023 International Joint Conference on Neural Networks, IJCNN 2023
Country/TerritoryAustralia
CityGold Coast
Period6/18/236/23/23

Keywords

  • compression
  • neural network

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence

Fingerprint

Dive into the research topics of 'Neuron-based Pruning of Deep Neural Networks with Better Generalization using Kronecker Factored Curvature Approximation'. Together they form a unique fingerprint.

Cite this