Abstract
We introduce the Bi-Directional Sparse Hopfield Network (BiSHop), a novel end-to-end framework for tabular learning. BiSHop handles the two major challenges of deep tabular learning: the non-rotationally invariant structure of tabular data and its feature sparsity. Our key motivation comes from the recently established connection between associative memory and attention mechanisms. Consequently, BiSHop uses a dual-component approach, sequentially processing data both column-wise and row-wise through two interconnected directional learning modules. Computationally, these modules stack generalized sparse modern Hopfield layers, a sparse extension of the modern Hopfield model with learnable sparsity. Methodologically, BiSHop facilitates multi-scale representation learning, capturing both intra-feature and inter-feature interactions, with adaptive sparsity at each scale. Empirically, through experiments on diverse real-world datasets, BiSHop surpasses current state-of-the-art (SOTA) methods with significantly fewer hyperparameter optimization (HPO) runs, making it a robust solution for deep tabular learning. The code is available on GitHub; future updates are on arXiv.
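As a rough illustration of the ideas in the abstract, the sketch below shows one sparse Hopfield-style retrieval step applied first across columns and then across rows of a small matrix. It is not the authors' implementation: the function names (`sparsemax`, `hopfield_retrieve`, `bidirectional_pass`), the inverse temperature `beta`, and the use of fixed sparsemax (the alpha = 2 special case of entmax) in place of learnable sparsity are all assumptions made for illustration.

```python
def sparsemax(z):
    """Euclidean projection of the score vector z onto the probability simplex.

    Unlike softmax, low-scoring entries receive exactly zero weight.
    """
    zs = sorted(z, reverse=True)
    cumsum, tau = 0.0, 0.0
    for j, v in enumerate(zs, start=1):
        cumsum += v
        # The support is the longest prefix for which this condition holds.
        if 1.0 + j * v > cumsum:
            tau = (cumsum - 1.0) / j
    return [max(v - tau, 0.0) for v in z]


def hopfield_retrieve(query, memories, beta=1.0):
    """One retrieval step: a sparse convex combination of stored patterns."""
    scores = [beta * sum(q * m for q, m in zip(query, mem)) for mem in memories]
    weights = sparsemax(scores)
    dim = len(memories[0])
    return [sum(w * mem[d] for w, mem in zip(weights, memories)) for d in range(dim)]


def self_retrieve(matrix, beta=1.0):
    """Each row queries the set of all rows (hypothetical helper)."""
    return [hopfield_retrieve(row, matrix, beta) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def bidirectional_pass(matrix, beta=1.0):
    """Column-wise retrieval followed by row-wise retrieval (illustrative only)."""
    by_column = transpose(self_retrieve(transpose(matrix), beta))
    return self_retrieve(by_column, beta)


if __name__ == "__main__":
    table = [[1.0, 0.0, 2.0], [0.5, 1.0, 0.0]]
    print(sparsemax([2.0, 1.0, 0.1]))
    print(bidirectional_pass(table))
```

The output keeps the shape of the input matrix, and because sparsemax zeroes out weak matches, each retrieved vector mixes only a subset of the stored patterns. A learnable-sparsity variant would swap `sparsemax` for an entmax with a trainable alpha.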
Original language | English (US) |
---|---|
Pages (from-to) | 55048-55075 |
Number of pages | 28 |
Journal | Proceedings of Machine Learning Research |
Volume | 235 |
State | Published - 2024 |
Event | 41st International Conference on Machine Learning, ICML 2024 - Vienna, Austria; Duration: Jul 21 2024 → Jul 27 2024 |
Funding
JH would like to thank Dino Feng and Andrew Chen for enlightening discussions, the Red Maple Family for support, and Jiayi Wang for facilitating experimental deployments. CX would like to thank Yibo Wen for helpful comments. The authors would also like to thank the anonymous reviewers and program chairs for their constructive comments. JH is partially supported by the Walter P. Murphy Fellowship. HL is partially supported by NIH R01LM1372201, NSF CAREER1841569, DOE DE-AC02-07CH11359, DOE LAB 20-2261 and an NSF TRIPODS1740735. H.-S.G. acknowledges support from the National Science and Technology Council, Taiwan under Grants No. NSTC 113-2119-M-002-021, No. NSTC 112-2119-M-002-014, No. NSTC 111-2119-M-002-007, and No. NSTC 111-2627-M-002-001, from the US Air Force Office of Scientific Research under Award Number FA2386-20-1-4052, and from the National Taiwan University under Grants No. NTU-CC-112L893404 and No. NTU-CC-113L891604. H.-S.G. is also grateful for the support from the “Center for Advanced Computing and Imaging in Biomedicine (NTU-113L900702)” through The Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE), Taiwan, and the support from the Physics Division, National Center for Theoretical Sciences, Taiwan. This research was supported in part through the computational resources and staff contributions provided for the Quest high performance computing facility at Northwestern University, which is jointly supported by the Office of the Provost, the Office for Research, and Northwestern University Information Technology. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.
ASJC Scopus subject areas
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability