Sparse PCA with oracle property

Quanquan Gu, Zhaoran Wang, Han Liu

Research output: Contribution to journal › Conference article

11 Citations (Scopus)

Abstract

In this paper, we study the estimation of the k-dimensional sparse principal subspace of the covariance matrix Σ in the high-dimensional setting. We aim to recover the oracle principal subspace solution, i.e., the principal subspace estimator obtained assuming the true support is known a priori. To this end, we propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations. In particular, under a weak assumption on the magnitude of the population projection matrix, one estimator within this family exactly recovers the true support with high probability, has exact rank k, and attains a √(s/n) statistical rate of convergence, with s being the subspace sparsity level and n the sample size. Compared to existing support recovery results for sparse PCA, our approach does not hinge on the spiked covariance model or the limited correlation condition. As a complement to the first estimator that enjoys the oracle property, we prove that another estimator within the family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA, even when the previous assumption on the magnitude of the projection matrix is violated. We validate the theoretical results by numerical experiments on synthetic datasets.
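The oracle solution mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `oracle_principal_subspace`, its arguments, and the use of the centered sample covariance are assumptions made for illustration. Given a support set assumed to be known, it runs ordinary PCA on the covariance restricted to that support and embeds the resulting rank-k projection matrix back into the full d × d space.

```python
import numpy as np


def oracle_principal_subspace(X, support, k):
    """Rank-k projection matrix from PCA restricted to a known support.

    X       : (n, d) data matrix, one sample per row.
    support : indices of the variables assumed to carry the subspace.
    k       : dimension of the principal subspace (k <= len(support)).
    """
    n, d = X.shape
    support = np.asarray(support)
    if k > support.size:
        raise ValueError("k must not exceed the support size")

    # Sample covariance of the centered data (an assumption of this sketch).
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n

    # Restrict the covariance to the known support.
    S_sub = S[np.ix_(support, support)]

    # eigh returns eigenvalues in ascending order, so the last k
    # eigenvectors span the top-k principal subspace.
    _, vecs = np.linalg.eigh(S_sub)
    U = vecs[:, -k:]

    # Embed the projection matrix back into d x d; entries outside
    # the support stay exactly zero.
    P = np.zeros((d, d))
    P[np.ix_(support, support)] = U @ U.T
    return P
```

For example, calling `oracle_principal_subspace(X, [0, 2, 5, 7], 2)` on a data matrix with ten columns returns a symmetric, idempotent matrix of trace 2 whose rows and columns outside the four support indices are zero. In practice the support is unknown, which is why the abstract treats this estimator as the target the proposed estimators aim to recover.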

Original language: English (US)
Pages (from-to): 1529-1537
Number of pages: 9
Journal: Advances in Neural Information Processing Systems
Volume: 2
Issue number: January
State: Published - Jan 1 2014
Event: 28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014 - Montreal, Canada
Duration: Dec 8 2014 to Dec 13 2014

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Cite this

@article{9f9c4191e92549089f01a6de5216860d,
title = "Sparse PCA with oracle property",
author = "Quanquan Gu and Zhaoran Wang and Han Liu",
year = "2014",
month = "1",
day = "1",
language = "English (US)",
volume = "2",
pages = "1529--1537",
journal = "Advances in Neural Information Processing Systems",
issn = "1049-5258",
number = "January",

}

Sparse PCA with oracle property. / Gu, Quanquan; Wang, Zhaoran; Liu, Han.

In: Advances in Neural Information Processing Systems, Vol. 2, No. January, 01.01.2014, p. 1529-1537.

TY - JOUR

T1 - Sparse PCA with oracle property

AU - Gu, Quanquan

AU - Wang, Zhaoran

AU - Liu, Han

PY - 2014/1/1

Y1 - 2014/1/1

UR - http://www.scopus.com/inward/record.url?scp=84937917562&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84937917562&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:84937917562

VL - 2

SP - 1529

EP - 1537

JO - Advances in Neural Information Processing Systems

JF - Advances in Neural Information Processing Systems

SN - 1049-5258

IS - January

ER -