Exact and inexact subsampled Newton methods for optimization

Raghu Bollapragada, Richard H. Byrd, Jorge Nocedal*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

86 Scopus citations

Abstract

The paper studies the solution of stochastic optimization problems in which approximations to the gradient and Hessian are obtained through subsampling. We first consider Newton-like methods that employ these approximations and discuss how to coordinate the accuracy in the gradient and Hessian to yield a superlinear rate of convergence in expectation. The second part of the paper analyzes an inexact Newton method that solves linear systems approximately using the conjugate gradient (CG) method, and that samples the Hessian and not the gradient (the gradient is assumed to be exact). We provide a complexity analysis for this method based on the properties of the CG iteration and the quality of the Hessian approximation, and compare it with a method that employs a stochastic gradient iteration instead of the CG method. We report preliminary numerical results that illustrate the performance of inexact subsampled Newton methods on machine learning applications based on logistic regression.
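
Illustrative code sketch

The following is a minimal sketch, not the authors' implementation, of the kind of inexact method analyzed in the second part of the paper: exact gradients, a Hessian subsampled from the data and accessed only through Hessian-vector products, and a truncated conjugate gradient solve for the Newton step, applied to logistic regression with labels in {-1, +1}. All parameter choices (subsample size, CG budget, step length) are illustrative assumptions, and the small ridge term lam is an added stabilization not discussed in the abstract.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def full_gradient(w, X, y, lam):
    """Exact gradient of the average logistic loss (labels y in {-1, +1}),
    plus a small ridge term lam*w added purely for numerical stability."""
    z = y * (X @ w)
    return -(X.T @ (y * sigmoid(-z))) / len(y) + lam * w

def subsampled_hess_vec(w, v, X_S, y_S, lam):
    """Hessian-vector product H_S v computed from the subsample S only;
    the Hessian matrix itself is never formed."""
    z = y_S * (X_S @ w)
    d = sigmoid(z) * sigmoid(-z)          # per-sample curvature weights
    return X_S.T @ (d * (X_S @ v)) / len(y_S) + lam * v

def truncated_cg(hv, b, max_iter, tol=1e-6):
    """Standard CG on H_S p = b, stopped after max_iter steps (inexact solve)."""
    x = np.zeros_like(b)
    r = b.copy()                          # residual at x = 0 is b itself
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Hp = hv(p)
        alpha = rs / (p @ Hp)
        x += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol * np.linalg.norm(b):
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def inexact_subsampled_newton(X, y, n_iters=20, hess_sample=256,
                              cg_iters=10, step=1.0, lam=1e-4, seed=0):
    rng = np.random.default_rng(seed)
    N, n = X.shape
    w = np.zeros(n)
    for _ in range(n_iters):
        g = full_gradient(w, X, y, lam)                  # gradient is exact
        S = rng.choice(N, size=min(hess_sample, N), replace=False)
        hv = lambda v: subsampled_hess_vec(w, v, X[S], y[S], lam)
        p = truncated_cg(hv, -g, max_iter=cg_iters)      # inexact Newton step
        w = w + step * p
    return w
```

In the paper, the convergence rate is governed by how the Hessian sample size and the accuracy of the CG solve are coordinated across iterations; the sketch above fixes both to constants for simplicity.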

Original language: English (US)
Pages (from-to): 545-578
Number of pages: 34
Journal: IMA Journal of Numerical Analysis
Volume: 39
Issue number: 2
DOIs
State: Published - Jan 1 2019

Funding

Raghu Bollapragada was supported by the Office of Naval Research award N000141410313. Richard Byrd was supported by the National Science Foundation grant DMS-1620070. Jorge Nocedal was supported by the Department of Energy grant DE-FG02-87ER25047 and the National Science Foundation grant DMS-1620022.

Keywords

  • machine learning
  • stochastic optimization
  • subsampling

ASJC Scopus subject areas

  • General Mathematics
  • Computational Mathematics
  • Applied Mathematics
