Exact learning dynamics of deep linear networks with prior knowledge

Lukas Braun, Clémentine C.J. Dominé, James E. Fitzgerald, Andrew M. Saxe

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

17 Scopus citations

Abstract

Learning in deep neural networks is known to depend critically on the knowledge embedded in the initial network weights. However, few theoretical results have precisely linked prior knowledge to learning dynamics. Here we derive exact solutions to the dynamics of learning with rich prior knowledge in deep linear networks by generalising Fukumizu's matrix Riccati solution [1]. We obtain explicit expressions for the evolving network function, hidden representational similarity, and neural tangent kernel over training for a broad class of initialisations and tasks. The expressions reveal a class of task-independent initialisations that radically alter learning, turning slow non-linear dynamics into fast exponential trajectories, while converging to a global optimum with identical representational similarity, dissociating learning trajectories from the structure of initial internal representations. We characterise how network weights dynamically align with task structure, rigorously justifying why previous solutions successfully described learning from small initial weights without incorporating their fine-scale structure. Finally, we discuss the implications of these findings for continual learning, reversal learning, and learning of structured knowledge. Taken together, our results provide a mathematical toolkit for understanding the impact of prior knowledge on deep learning.
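To make the setting concrete, below is a minimal simulation sketch of the kind of system the paper analyses: a two-layer deep linear network trained by full-batch gradient descent on a linear regression task, run from initialisations of different scales. This is an illustration only, not the paper's exact analytical solutions or its structured initialisations; all dimensions, the teacher map, the learning rate, and the initialisation scales are assumptions chosen for the sketch.

# Minimal sketch (assumed setup, not the paper's method): gradient
# descent on a two-layer deep linear network y = W2 @ W1 @ x for a
# linear regression task, comparing initialisation scales.
import numpy as np

rng = np.random.default_rng(0)

d_in, d_hid, d_out = 8, 16, 4
n_samples = 200

# Whitened Gaussian inputs and a random linear teacher define the task.
X = rng.standard_normal((n_samples, d_in))
W_teacher = rng.standard_normal((d_out, d_in))
Y = X @ W_teacher.T

def train(init_scale, lr=1e-2, steps=2000):
    """Full-batch gradient descent; returns the loss trajectory."""
    W1 = init_scale * rng.standard_normal((d_hid, d_in))
    W2 = init_scale * rng.standard_normal((d_out, d_hid))
    losses = []
    for _ in range(steps):
        pred = X @ (W2 @ W1).T          # network function on the data
        err = pred - Y
        losses.append(0.5 * float(np.sum(err ** 2)) / n_samples)
        # Gradients of the mean squared error w.r.t. each layer.
        g2 = (err.T @ (X @ W1.T)) / n_samples
        g1 = (W2.T @ err.T @ X) / n_samples
        W2 -= lr * g2
        W1 -= lr * g1
    return np.array(losses)

for scale in (1e-3, 1e-1, 1.0):
    loss = train(scale)
    print(f"init scale {scale:g}: final loss {loss[-1]:.3e}")

In runs of this kind, very small initial weights typically produce a loss plateau followed by a rapid, stage-like drop (the slow non-linear regime the abstract refers to), while larger balanced initialisations decay more nearly exponentially; the paper's contribution is to give exact closed-form trajectories for such networks under a broad class of initialisations, which this toy simulation does not reproduce.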

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Editors: S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, A. Oh
Publisher: Neural Information Processing Systems Foundation
ISBN (Electronic): 9781713871088
State: Published - 2022
Event: 36th Conference on Neural Information Processing Systems, NeurIPS 2022 - New Orleans, United States
Duration: Nov 28, 2022 - Dec 9, 2022

Publication series

Name: Advances in Neural Information Processing Systems
Volume: 35
ISSN (Print): 1049-5258

Conference

Conference: 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Country/Territory: United States
City: New Orleans
Period: 11/28/22 - 12/9/22

Funding

L.B. was supported by the Woodward Scholarship awarded by Wadham College, Oxford, and by the Medical Research Council [MR/N013468/1]. C.D. and A.S. were supported by the Gatsby Charitable Foundation (GAT3755). Further, A.S. was supported by a Sir Henry Dale Fellowship from the Wellcome Trust and Royal Society (216386/Z/19/Z) and the Sainsbury Wellcome Centre Core Grant (219627/Z/19/Z). A.S. is a CIFAR Azrieli Global Scholar in the Learning in Machines & Brains program. J.F. was supported by the Howard Hughes Medical Institute.

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
