Theoretical Analysis of Weak-to-Strong Generalization

Hunter Lang, David Sontag, Aravindan Vijayaraghavan

Research output: Contribution to journalConference articlepeer-review

Abstract

Strong student models can learn from weaker teachers: when trained on the predictions of a weaker model, a strong pretrained student can learn to correct the weak model's errors and generalize to examples where the teacher is not confident, even when these examples are excluded from training. This enables learning from cheap, incomplete, and possibly incorrect label information, such as coarse logical rules or the generations of a language model. We show that existing weak supervision theory fails to account for both of these effects, which we call pseudolabel correction and coverage expansion, respectively. We give a new bound based on expansion properties of the data distribution and student hypothesis class that directly accounts for pseudolabel correction and coverage expansion. Our bounds capture the intuition that weak-to-strong generalization occurs when the strong model is unable to fit the mistakes of the weak teacher without incurring additional error. We show that these expansion properties can be checked from finite data and give empirical evidence that they hold in practice.

Original languageEnglish (US)
JournalAdvances in Neural Information Processing Systems
Volume37
StatePublished - 2024
Event38th Conference on Neural Information Processing Systems, NeurIPS 2024 - Vancouver, Canada
Duration: Dec 9 2024Dec 15 2024

Funding

Thanks to Hussein Mozannar and Ilker Demirel for feedback on drafts of this paper. DS and HL were partially supported by NSF AitF award CCF-1723344, AV was supported by NSF grants EECS2216970 and CCF-2154100, and HL was supported by the NDSEG fellowship. Finally, this project was partially supported by an OpenAI \u201CSuperalignment Fast\u201D grant.

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Fingerprint

Dive into the research topics of 'Theoretical Analysis of Weak-to-Strong Generalization'. Together they form a unique fingerprint.

Cite this