Heterogeneous Feature Fusion Based Machine Learning on Shallow-Wide and Heterogeneous-Sparse Industrial Datasets

Zijiang Yang*, Tetsushi Watari, Daisuke Ichigozaki, Akita Mitsutoshi, Hiroaki Takahashi, Yoshinori Suga, Wei-keng Liao, Alok Choudhary, Ankit Agrawal

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Although machine learning has achieved great success in industry, mining industrial data remains challenging, especially in manufacturing domains, because industrial data can be 1) shallow and wide and 2) highly heterogeneous and sparse. In particular, mining sparse data (i.e., data with missing features) is extremely challenging, because some features (e.g., images) are not easy to fill in, and removing data points would reduce the data size further. Thus, in this work, we propose a machine learning framework that combines transfer learning, heterogeneous feature fusion, principal component analysis, and gradient boosting to address these challenges and effectively develop predictive models on industrial datasets. Compared with a non-fusion method and a traditional fusion method on two real-world datasets from Toyota Motor Corporation, the proposed method not only maximizes the utility of available features and data to achieve more stable and better performance, but also gives more flexibility when predicting new, unseen data points for which only a partial set of features is available. (Code and data are available at: https://github.com/zyz293/FusionML.)

Original language: English (US)
Title of host publication: Pattern Recognition. ICPR International Workshops and Challenges, 2021, Proceedings
Editors: Alberto Del Bimbo, Rita Cucchiara, Stan Sclaroff, Giovanni Maria Farinella, Tao Mei, Marco Bertini, Hugo Jair Escalante, Roberto Vezzani
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 566-577
Number of pages: 12
ISBN (Print): 9783030687984
State: Published - 2021
Event: 25th International Conference on Pattern Recognition Workshops, ICPR 2020 - Virtual, Online
Duration: Jan 10 2021 to Jan 15 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12664 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 25th International Conference on Pattern Recognition Workshops, ICPR 2020
City: Virtual, Online
Period: 1/10/21 to 1/15/21

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)
