Optimization for large-scale machine learning with distributed features and observations

Alexandros Nathan*, Diego Klabjan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

As the size of modern data sets exceeds the disk and memory capacities of a single computer, machine learning practitioners have resorted to parallel and distributed computing. Given that optimization is one of the pillars of machine learning and predictive modeling, distributed optimization methods have recently garnered ample attention in the literature. Although previous research has mostly focused on settings where either the observations or the features of the problem at hand are stored in a distributed fashion, the situation where both are partitioned across the nodes of a computer cluster (doubly distributed) has barely been studied. In this work we propose two doubly distributed optimization algorithms. The first falls under the umbrella of distributed dual coordinate ascent methods, while the second belongs to the class of stochastic gradient/coordinate descent hybrid methods. We conduct numerical experiments in Spark using real-world and simulated data sets and study the scaling properties of our methods. Our empirical evaluation demonstrates that the proposed algorithms outperform a block distributed ADMM method, which is, to the best of our knowledge, the only other existing doubly distributed optimization algorithm.
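The doubly distributed setting described in the abstract partitions a data matrix across both observations (rows) and features (columns), so each cluster node holds only one block of the grid. The following NumPy sketch illustrates that partitioning scheme only; it is an assumption-laden toy, not the authors' Spark implementation, and the function name and block counts are made up for illustration.

```python
import numpy as np

def doubly_distribute(X, row_blocks, col_blocks):
    """Split an (n x d) data matrix into a grid of blocks, one per node.

    In the doubly distributed setting, node (i, j) holds only the
    observations in row block i restricted to the features in
    column block j. (Illustrative sketch, not the paper's code.)
    """
    row_parts = np.array_split(np.arange(X.shape[0]), row_blocks)
    col_parts = np.array_split(np.arange(X.shape[1]), col_blocks)
    return {(i, j): X[np.ix_(r, c)]
            for i, r in enumerate(row_parts)
            for j, c in enumerate(col_parts)}

# Example: 6 observations, 4 features, split across a 2 x 2 grid of nodes.
X = np.arange(24.0).reshape(6, 4)
blocks = doubly_distribute(X, row_blocks=2, col_blocks=2)
# Each of the 4 nodes holds a 3 x 2 block of the full matrix.
```

A doubly distributed optimizer then works only with local blocks on each node, communicating partial gradient or dual information between nodes rather than shipping the full matrix anywhere.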

Original language: English (US)
Title of host publication: Machine Learning and Data Mining in Pattern Recognition - 13th International Conference, MLDM 2017, Proceedings
Editors: Petra Perner
Publisher: Springer Verlag
Pages: 132-146
Number of pages: 15
ISBN (Print): 9783319624150
DOI: 10.1007/978-3-319-62416-7_10
State: Published - Jan 1 2017
Event: 13th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2017 - New York, United States
Duration: Jul 15 2017 - Jul 20 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10358 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 13th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2017
Country: United States
City: New York
Period: 7/15/17 - 7/20/17

Keywords

  • Big data
  • Distributed optimization
  • Machine learning
  • Spark

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)


Cite this

    Nathan, A., & Klabjan, D. (2017). Optimization for large-scale machine learning with distributed features and observations. In P. Perner (Ed.), Machine Learning and Data Mining in Pattern Recognition - 13th International Conference, MLDM 2017, Proceedings (pp. 132-146). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10358 LNAI). Springer Verlag. https://doi.org/10.1007/978-3-319-62416-7_10