TY - GEN
T1 - Efficient Algorithms for Learning Depth-2 Neural Networks with General ReLU Activations
AU - Awasthi, Pranjal
AU - Tang, Alex
AU - Vijayaraghavan, Aravindan
N1 - Funding Information:
The last two authors are supported by the National Science Foundation (NSF) under Grant Nos. CCF-1652491, CCF-1637585, and CCF-1934931.
Publisher Copyright:
© 2021 Neural information processing systems foundation. All rights reserved.
PY - 2021
Y1 - 2021
AB - We present polynomial time and sample efficient algorithms for learning an unknown depth-2 feedforward neural network with general ReLU activations, under mild non-degeneracy assumptions. In particular, we consider learning an unknown network of the form f(x) = a^T σ(W^T x + b), where x is drawn from the Gaussian distribution, and σ(t) := max(t, 0) is the ReLU activation. Prior works for learning networks with ReLU activations assume that the bias b is zero. In order to deal with the presence of the bias terms, our proposed algorithm consists of robustly decomposing multiple higher order tensors arising from the Hermite expansion of the function f(x). Using these ideas we also establish identifiability of the network parameters under minimal assumptions.
UR - http://www.scopus.com/inward/record.url?scp=85125031140&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85125031140&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85125031140
T3 - Advances in Neural Information Processing Systems
SP - 13485
EP - 13496
BT - Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
A2 - Ranzato, Marc'Aurelio
A2 - Beygelzimer, Alina
A2 - Dauphin, Yann
A2 - Liang, Percy S.
A2 - Wortman Vaughan, Jenn
PB - Neural information processing systems foundation
T2 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
Y2 - 6 December 2021 through 14 December 2021
ER -