TY - JOUR
T1 - On Tighter Generalization Bounds for Deep Neural Networks
T2 - CNNs, ResNets, and beyond
AU - Li, Xingguo
AU - Lu, Junwei
AU - Wang, Zhaoran
AU - Haupt, Jarvis
AU - Zhao, Tuo
N1 - Publisher Copyright:
Copyright © 2018, The Authors. All rights reserved.
Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2018/6/13
Y1 - 2018/6/13
N2 - We establish a margin-based, data-dependent generalization error bound for a general family of deep neural networks in terms of the depth and width of the networks, as well as the spectral norm of the weight matrices. By introducing a new characterization of the Lipschitz properties of the neural network family, we achieve a tighter generalization error bound. Moreover, we show that the generalization bound can be further improved for bounded losses. In addition, we demonstrate that the margin scales with the product of norms, which eliminates the concern about the vacuity of norm-based bounds. Aside from general feedforward deep neural networks, our results can be applied to derive new bounds for several popular architectures, including convolutional neural networks (CNNs), residual networks (ResNets), and hyperspherical networks (SphereNets). When achieving the same generalization error as prior work, our bounds allow for a larger parameter space of weight matrices, inducing potentially stronger expressive power for neural networks. Finally, we discuss the limitations of existing generalization bounds for understanding deep neural networks with ReLU activations in classification.
AB - We establish a margin-based, data-dependent generalization error bound for a general family of deep neural networks in terms of the depth and width of the networks, as well as the spectral norm of the weight matrices. By introducing a new characterization of the Lipschitz properties of the neural network family, we achieve a tighter generalization error bound. Moreover, we show that the generalization bound can be further improved for bounded losses. In addition, we demonstrate that the margin scales with the product of norms, which eliminates the concern about the vacuity of norm-based bounds. Aside from general feedforward deep neural networks, our results can be applied to derive new bounds for several popular architectures, including convolutional neural networks (CNNs), residual networks (ResNets), and hyperspherical networks (SphereNets). When achieving the same generalization error as prior work, our bounds allow for a larger parameter space of weight matrices, inducing potentially stronger expressive power for neural networks. Finally, we discuss the limitations of existing generalization bounds for understanding deep neural networks with ReLU activations in classification.
UR - http://www.scopus.com/inward/record.url?scp=85095030946&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85095030946&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:85095030946
ER -