GENERALIZATION OF SCALED DEEP RESNETS IN THE MEAN-FIELD REGIME

Yihang Chen*, Fanghui Liu*, Yiping Lu, Grigorios G. Chrysos, Volkan Cevher

*Corresponding authors for this work

Research output: Contribution to conference › Paper › peer-review

1 Scopus citation

Abstract

Despite the widespread empirical success of ResNet, the generalization properties of deep ResNets are rarely explored beyond the lazy training regime. In this work, we investigate scaled ResNets in the limit of infinitely deep and wide neural networks, whose gradient flow is described by a partial differential equation in the large-network limit, i.e., the mean-field regime. To derive generalization bounds in this setting, our analysis necessitates a shift from the conventional time-invariant Gram matrix employed in the lazy training regime to a time-variant, distribution-dependent version. To this end, we provide a global lower bound on the minimum eigenvalue of the Gram matrix under the mean-field regime. Besides, for the tractability of the dynamics of the Kullback-Leibler (KL) divergence, we establish the linear convergence of the empirical error and estimate an upper bound on the KL divergence over the parameter distribution. Finally, we establish uniform convergence for the generalization bound via Rademacher complexity. Our results offer new insights into the generalization ability of deep ResNets beyond the lazy training regime and contribute to advancing the understanding of the fundamental properties of deep neural networks.
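To make the setting concrete, the following is a common formulation of scaled deep ResNets and their mean-field limit from the literature on this topic; the specific scaling, symbols, and loss functional below are illustrative assumptions, not necessarily the exact ones used in the paper. A ResNet of depth $L$ with $M$ neurons per residual block,
\[
x_{l+1} \;=\; x_l \;+\; \frac{1}{LM} \sum_{m=1}^{M} \sigma\!\left(x_l;\, \theta_{l,m}\right), \qquad l = 0, \dots, L-1,
\]
becomes, in the joint limit $L, M \to \infty$, a continuous-depth flow driven by a depth-indexed parameter distribution $\rho_t$:
\[
\frac{\mathrm{d} X_t}{\mathrm{d} t} \;=\; \int \sigma\!\left(X_t;\, \theta\right) \, \rho_t(\mathrm{d}\theta), \qquad t \in [0, 1].
\]
Training by gradient flow on a loss functional $\mathcal{L}(\rho)$ then evolves $\rho$ (in training time $s$) according to a continuity equation, which is the partial differential equation referred to in the abstract:
\[
\partial_s \rho_s \;=\; \nabla_\theta \cdot \left( \rho_s \, \nabla_\theta \frac{\delta \mathcal{L}}{\delta \rho}(\rho_s) \right),
\]
i.e., a Wasserstein gradient flow over the space of parameter distributions rather than a finite-dimensional gradient descent on weights.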

Original language: English (US)
State: Published - 2024
Event: 12th International Conference on Learning Representations, ICLR 2024 - Hybrid, Vienna, Austria
Duration: May 7, 2024 - May 11, 2024

Conference

Conference: 12th International Conference on Learning Representations, ICLR 2024
Country/Territory: Austria
City: Hybrid, Vienna
Period: 5/7/24 - 5/11/24

Funding

This work was carried out in the EPFL LIONS group. It was supported by the Hasler Foundation Program: Hasler Responsible AI (project number 21043), by the Army Research Office under Grant Number W911NF-24-1-0048, and by the Swiss National Science Foundation (SNSF) under grant number 200021_205011. Corresponding authors: Fanghui Liu and Yihang Chen.

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science Applications
  • Education
  • Linguistics and Language
