TY - JOUR
T1 - LassoNet: Neural Networks with Feature Sparsity
T2 - 24th International Conference on Artificial Intelligence and Statistics, AISTATS 2021
AU - Lemhadri, Ismael
AU - Ruan, Feng
AU - Tibshirani, Robert
N1 - Funding Information:
We would like to thank John Duchi and Ryan Tibshirani for helpful comments. We would like to thank Louis Abraham for help with the implementation of LassoNet. Robert Tibshirani was supported by NIH grant 5R01 EB001988-16 and NSF grant 19 DMS1208164.
Publisher Copyright:
Copyright © 2021 by the author(s)
PY - 2021
Y1 - 2021
N2 - Much work has been done recently to make neural networks more interpretable, and one approach is to arrange for the network to use only a subset of the available features. In linear models, Lasso (or l1-regularized) regression assigns zero weights to the most irrelevant or redundant features, and is widely used in data science. However, the Lasso only applies to linear models. Here we introduce LassoNet, a neural network framework with global feature selection. Our approach achieves feature sparsity by allowing a feature to participate in a hidden unit only if its linear representative is active. Unlike other approaches to feature selection for neural nets, our method uses a modified objective function with constraints, and so integrates feature selection directly with the parameter learning. As a result, it delivers an entire regularization path of solutions with a range of feature sparsity. In experiments with real and simulated data, LassoNet significantly outperforms state-of-the-art methods for feature selection and regression. The LassoNet method uses projected proximal gradient descent, and generalizes directly to deep networks. It can be implemented by adding just a few lines of code to a standard neural network.
AB - Much work has been done recently to make neural networks more interpretable, and one approach is to arrange for the network to use only a subset of the available features. In linear models, Lasso (or l1-regularized) regression assigns zero weights to the most irrelevant or redundant features, and is widely used in data science. However, the Lasso only applies to linear models. Here we introduce LassoNet, a neural network framework with global feature selection. Our approach achieves feature sparsity by allowing a feature to participate in a hidden unit only if its linear representative is active. Unlike other approaches to feature selection for neural nets, our method uses a modified objective function with constraints, and so integrates feature selection directly with the parameter learning. As a result, it delivers an entire regularization path of solutions with a range of feature sparsity. In experiments with real and simulated data, LassoNet significantly outperforms state-of-the-art methods for feature selection and regression. The LassoNet method uses projected proximal gradient descent, and generalizes directly to deep networks. It can be implemented by adding just a few lines of code to a standard neural network.
UR - http://www.scopus.com/inward/record.url?scp=85159308133&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85159308133&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85159308133
SN - 2640-3498
VL - 130
SP - 10
EP - 18
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
Y2 - 13 April 2021 through 15 April 2021
ER -