Density estimation for shift-invariant multidimensional distributions

Anindya De, Philip M. Long, Rocco A. Servedio

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We study density estimation for classes of shift-invariant distributions over $\mathbb{R}^d$. A multidimensional distribution is “shift-invariant” if, roughly speaking, it is close in total variation distance to a small shift of itself in any direction. Shift-invariance relaxes smoothness assumptions commonly used in non-parametric density estimation to allow jump discontinuities. The different classes of distributions that we consider correspond to different rates of tail decay. For each such class we give an efficient algorithm that learns any distribution in the class from independent samples with respect to total variation distance. As a special case of our general result, we show that $d$-dimensional shift-invariant distributions that satisfy an exponential tail bound can be learned to total variation distance error $\varepsilon$ using $\tilde{O}_d(1/\varepsilon^{d+2})$ examples and $\tilde{O}_d(1/\varepsilon^{2d+2})$ time. This implies that, for constant $d$, multivariate log-concave distributions can be learned in $\tilde{O}_d(1/\varepsilon^{2d+2})$ time using $\tilde{O}_d(1/\varepsilon^{d+2})$ samples, answering a question of [29]. All of our results extend to a model of noise-tolerant density estimation using Huber’s contamination model, in which the target distribution to be learned is a $(1 - \varepsilon, \varepsilon)$ mixture of some unknown distribution in the class with some other arbitrary and unknown distribution, and the learning algorithm must output a hypothesis distribution with total variation distance error $O(\varepsilon)$ from the target distribution. We show that our general results are close to best possible by proving a simple $\Omega(1/\varepsilon^d)$ information-theoretic lower bound on sample complexity even for learning bounded distributions that are shift-invariant.
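As an illustration of the contamination model mentioned in the abstract, the following minimal Python sketch draws samples from a $(1 - \varepsilon, \varepsilon)$ mixture. It is not the paper's learning algorithm. The names `sample_huber`, `clean`, and `noise` are hypothetical, and both the two-dimensional standard Gaussian target and the far-away point-mass contamination are assumptions chosen only to make the example concrete.

```python
import random


def sample_huber(clean_sampler, noise_sampler, eps, n, seed=0):
    """Draw n points from the (1 - eps, eps) mixture of two samplers."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # With probability eps the point comes from the arbitrary
        # contaminating distribution; otherwise from the clean target.
        if rng.random() < eps:
            samples.append(noise_sampler(rng))
        else:
            samples.append(clean_sampler(rng))
    return samples


def clean(rng):
    # Assumed target: standard Gaussian in d = 2 (a log-concave distribution).
    return (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))


def noise(rng):
    # Assumed contamination: a point mass far from the bulk of the target.
    return (100.0, 100.0)


data = sample_huber(clean, noise, eps=0.1, n=10000)
# Empirical fraction of contaminated points; it should be close to eps.
frac_noise = sum(1 for p in data if p == (100.0, 100.0)) / len(data)
print(f"contaminated fraction: {frac_noise:.3f}")
```

In this model a learner sees only `data`, without knowing which points are contaminated, and is asked for a hypothesis within total variation distance $O(\varepsilon)$ of the mixture.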

Original language: English (US)
Title of host publication: 10th Innovations in Theoretical Computer Science, ITCS 2019
Editors: Avrim Blum
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
ISBN (Electronic): 9783959770958
State: Published - Jan 1 2019
Event: 10th Innovations in Theoretical Computer Science, ITCS 2019 - San Diego, United States
Duration: Jan 10 2019 to Jan 12 2019

Publication series

Name: Leibniz International Proceedings in Informatics, LIPIcs
Volume: 124
ISSN (Print): 1868-8969

Conference

Conference: 10th Innovations in Theoretical Computer Science, ITCS 2019
Country: United States
City: San Diego
Period: 1/10/19 to 1/12/19

Keywords

  • Density estimation
  • Log-concave distributions
  • Non-parametrics
  • Unsupervised learning

ASJC Scopus subject areas

  • Software
