Efficient Algorithms for Learning Depth-2 Neural Networks with General ReLU Activations

Pranjal Awasthi, Alex Tang, Aravindan Vijayaraghavan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present polynomial-time and sample-efficient algorithms for learning an unknown depth-2 feedforward neural network with general ReLU activations, under mild non-degeneracy assumptions. In particular, we consider learning an unknown network of the form $f(x) = a^{\top}\sigma(W^{\top}x + b)$, where $x$ is drawn from the Gaussian distribution and $\sigma(t) := \max(t, 0)$ is the ReLU activation. Prior works on learning networks with ReLU activations assume that the bias $b$ is zero. To handle the presence of the bias terms, our proposed algorithm robustly decomposes multiple higher-order tensors arising from the Hermite expansion of the function $f(x)$. Using these ideas we also establish identifiability of the network parameters under minimal assumptions.
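The paper's actual method decomposes several higher-order Hermite tensors; the following minimal Python sketch (synthetic network, sizes, and variable names are all illustrative assumptions, not the authors' code) shows only the underlying idea: for Gaussian inputs, the empirical second-order Hermite moment E[f(x)(xx^T - I)] equals a weighted sum of rank-1 terms w_i w_i^T, so its range reveals the hidden weight directions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 10, 3, 200_000  # input dim, hidden width, sample count (hypothetical)

# Synthetic ground-truth network f(x) = a^T sigma(W^T x + b), for illustration only.
W = rng.normal(size=(d, k))
W /= np.linalg.norm(W, axis=0)       # unit-norm hidden directions w_i
a = rng.normal(size=k)
b = rng.normal(size=k)

X = rng.normal(size=(n, d))          # x ~ N(0, I_d)
y = np.maximum(X @ W + b, 0.0) @ a   # f(x) = a^T ReLU(W^T x + b)

# Empirical order-2 Hermite moment: T2 estimates E[f(x) (x x^T - I)].
# For ReLU with bias, T2 = sum_i a_i c_2(b_i) w_i w_i^T, so its range spans
# the hidden directions; the paper goes further and robustly decomposes
# higher-order analogues of this tensor to recover each w_i, a_i, b_i.
T2 = (X * y[:, None]).T @ X / n - y.mean() * np.eye(d)

# Generically T2 has rank k, so the top-k eigenvalues should dominate.
eigvals = np.sort(np.abs(np.linalg.eigvalsh(T2)))[::-1]
print("leading |eigenvalues|:", eigvals[: k + 2])
```

With enough samples the (k+1)-th eigenvalue is close to zero, separating signal from noise; recovering the individual parameters, rather than just their span, is what requires the higher-order tensors the abstract refers to.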

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
Editors: Marc'Aurelio Ranzato, Alina Beygelzimer, Yann Dauphin, Percy S. Liang, Jenn Wortman Vaughan
Publisher: Neural Information Processing Systems Foundation
Pages: 13485-13496
Number of pages: 12
ISBN (Electronic): 9781713845393
State: Published - 2021
Event: 35th Conference on Neural Information Processing Systems, NeurIPS 2021 - Virtual, Online
Duration: Dec 6, 2021 - Dec 14, 2021

Publication series

Name: Advances in Neural Information Processing Systems
Volume: 34
ISSN (Print): 1049-5258

Conference

Conference: 35th Conference on Neural Information Processing Systems, NeurIPS 2021
City: Virtual, Online
Period: 12/6/21 - 12/14/21

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
