TY - GEN
T1 - Smoothed Analysis in Unsupervised Learning via Decoupling
AU - Bhaskara, Aditya
AU - Chen, Aidao
AU - Perreault, Aidan
AU - Vijayaraghavan, Aravindan
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/11
Y1 - 2019/11
N2 - Smoothed analysis is a powerful paradigm in overcoming worst-case intractability in unsupervised learning and high-dimensional data analysis. While polynomial time smoothed analysis guarantees have been obtained for worst-case intractable problems like tensor decompositions and learning mixtures of Gaussians, such guarantees have been hard to obtain for several other important problems in unsupervised learning. A core technical challenge in analyzing algorithms is obtaining lower bounds on the least singular value for random matrix ensembles with dependent entries, that are given by low-degree polynomials of a few base underlying random variables. In this work, we address this challenge by obtaining high-confidence lower bounds on the least singular value of new classes of structured random matrix ensembles of the above kind. We then use these bounds to design algorithms with polynomial time smoothed analysis guarantees for the following three important problems in unsupervised learning: •Robust subspace recovery, when the fraction of inliers in the d-dimensional subspace T of the n-dimensional Euclidean space is at least (d/n)t for any positive integer t. This contrasts with the known worst-case intractability when the fraction of inliers is at most d/n, and the previous smoothed analysis result (Hardt and Moitra, 2013). •Learning overcomplete hidden markov models, where the size of the state space is any polynomial in the dimension of the observations. This gives the first polynomial time guarantees for learning overcomplete HMMs in the smoothed analysis model. •Higher order tensor decompositions, where we generalize and analyze the so-called FOOBI algorithm of Cardoso to find order-T rank-one tensors in a subspace. This gives polynomially robust decomposition algorithms for order-2t tensors with rank nt.
AB - Smoothed analysis is a powerful paradigm in overcoming worst-case intractability in unsupervised learning and high-dimensional data analysis. While polynomial time smoothed analysis guarantees have been obtained for worst-case intractable problems like tensor decompositions and learning mixtures of Gaussians, such guarantees have been hard to obtain for several other important problems in unsupervised learning. A core technical challenge in analyzing algorithms is obtaining lower bounds on the least singular value for random matrix ensembles with dependent entries, that are given by low-degree polynomials of a few base underlying random variables. In this work, we address this challenge by obtaining high-confidence lower bounds on the least singular value of new classes of structured random matrix ensembles of the above kind. We then use these bounds to design algorithms with polynomial time smoothed analysis guarantees for the following three important problems in unsupervised learning: •Robust subspace recovery, when the fraction of inliers in the d-dimensional subspace T of the n-dimensional Euclidean space is at least (d/n)t for any positive integer t. This contrasts with the known worst-case intractability when the fraction of inliers is at most d/n, and the previous smoothed analysis result (Hardt and Moitra, 2013). •Learning overcomplete hidden markov models, where the size of the state space is any polynomial in the dimension of the observations. This gives the first polynomial time guarantees for learning overcomplete HMMs in the smoothed analysis model. •Higher order tensor decompositions, where we generalize and analyze the so-called FOOBI algorithm of Cardoso to find order-T rank-one tensors in a subspace. This gives polynomially robust decomposition algorithms for order-2t tensors with rank nt.
KW - anti concentration
KW - beyond worst-case analysis
KW - hidden markov model
KW - smoothed analysis
KW - subspace recovery
KW - tensor decomposition
KW - unsupervised learning
UR - http://www.scopus.com/inward/record.url?scp=85078432975&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85078432975&partnerID=8YFLogxK
U2 - 10.1109/FOCS.2019.00043
DO - 10.1109/FOCS.2019.00043
M3 - Conference contribution
AN - SCOPUS:85078432975
T3 - Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS
SP - 582
EP - 610
BT - Proceedings - 2019 IEEE 60th Annual Symposium on Foundations of Computer Science, FOCS 2019
PB - IEEE Computer Society
T2 - 60th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2019
Y2 - 9 November 2019 through 12 November 2019
ER -