On the statistical rate of nonlinear recovery in generative models with heavy-tailed data

Xiaohan Wei*, Zhuoran Yang, Zhaoran Wang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

1 Scopus citation

Abstract

We consider estimating a high-dimensional vector from nonlinear measurements, where the unknown vector is represented by a generative model G : ℝ^k → ℝ^d with k ≪ d. Such a model imposes structural priors on the unknown vector without requiring a dedicated basis and, in particular, allows new and efficient approaches to solving recovery problems with far fewer measurements than the ambient dimension of the vector. While progress has been made recently on the theoretical understanding of linear Gaussian measurements, much less is known when the model is possibly misspecified and the measurements are non-Gaussian. In this paper, we take a step in this direction by considering the scenario where the measurements are non-Gaussian and subject to possibly unknown nonlinear transformations, and the responses are heavy-tailed. We then propose new estimators via score functions based on the first- and second-order Stein's identity, and prove a sample size bound of m = O(kε^{-2} log(L/ε)) for achieving an ε error, in the form of exponential concentration inequalities. Furthermore, for the special case of a multi-layer ReLU generative model, we improve the sample bound by a logarithmic factor to m = O(kε^{-2} log(d)), matching the state-of-the-art statistical rate in compressed sensing for estimating k-sparse vectors. On the technical side, we develop new chaining methods for bounding heavy-tailed processes, which could be of independent interest.
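As a rough illustration of the first-order Stein's identity mentioned in the abstract, the sketch below estimates the direction of an unknown vector from nonlinear, heavy-tailed responses using non-Gaussian measurements. It is not the estimator from the paper: it omits the generative model G entirely, and the logistic design, the link function t + tanh(t), the Student-t noise, the clipping threshold `tau`, and all function names are assumptions chosen for illustration. The identity used is E[h(x) S(x)] = E[∇h(x)], where S(x) = -∇ log p(x) is the score of the measurement density; for independent standard logistic coordinates, S acts elementwise as tanh(x/2).

```python
import numpy as np


def logistic_score(x):
    """Score S(x) = -d/dx log p(x) of the standard logistic density (elementwise)."""
    return np.tanh(x / 2.0)


def stein_direction_estimate(x, y, tau):
    """Average clipped responses times scores, then normalize to unit length.

    Clipping at tau is an illustrative guard against heavy tails,
    not a claim about the paper's construction.
    """
    y_clipped = np.clip(y, -tau, tau)
    v = (y_clipped[:, None] * logistic_score(x)).mean(axis=0)
    return v / np.linalg.norm(v)


rng = np.random.default_rng(0)
m, d = 20000, 20  # number of measurements, ambient dimension (assumed values)

# Unknown unit vector to recover (up to scale).
theta = rng.normal(size=d)
theta /= np.linalg.norm(theta)

# Non-Gaussian measurements: i.i.d. standard logistic coordinates.
x = rng.logistic(size=(m, d))

# Monotone nonlinear link plus heavy-tailed Student-t noise (3 degrees of freedom).
t = x @ theta
y = t + np.tanh(t) + rng.standard_t(df=3, size=m)

theta_hat = stein_direction_estimate(x, y, tau=10.0)
cosine = float(theta_hat @ theta)
print(f"cosine similarity between estimate and truth: {cosine:.3f}")
```

Because the link is monotone, the expectation of the clipped response times the score is a positive multiple of θ, so only the direction is identified; this is why the estimate is normalized and compared by cosine similarity.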

Original language: English (US)
Title of host publication: 36th International Conference on Machine Learning, ICML 2019
Publisher: International Machine Learning Society (IMLS)
Pages: 11606-11643
Number of pages: 38
ISBN (Electronic): 9781510886988
State: Published - 2019
Event: 36th International Conference on Machine Learning, ICML 2019 - Long Beach, United States
Duration: Jun 9 2019 to Jun 15 2019

Publication series

Name: 36th International Conference on Machine Learning, ICML 2019
Volume: 2019-June

Conference

Conference: 36th International Conference on Machine Learning, ICML 2019
Country: United States
City: Long Beach
Period: 6/9/19 to 6/15/19

ASJC Scopus subject areas

  • Education
  • Computer Science Applications
  • Human-Computer Interaction
