Pruning methods that rank the weights of a trained network by their saliency can be better understood only with further information on the statistical properties of those saliencies. We focus on two-layer networks with either a linear or a nonlinear output unit, and obtain analytic expressions for the distributions of the saliencies and of their logarithms. Our results reveal unexpected universal properties of the log-saliency distribution and suggest a novel algorithm for saliency-based weight ranking that avoids the numerical cost of second-derivative evaluations.
ASJC Scopus subject areas
- Computer Networks and Communications