Revisiting Deep Learning Models for Tabular Data

Gorishniy, Yury; Rubachev, Ivan; Khrulkov, Valentin; Babenko, Artem

arXiv:2106.11959 (cs)

[Submitted on 22 Jun 2021 (v1), last revised 26 Oct 2023 (this version, v5)]

Abstract: The existing literature on deep learning for tabular data proposes a wide range of novel architectures and reports competitive results on various datasets. However, the proposed models are usually not properly compared to each other and existing works often use different benchmarks and experiment protocols. As a result, it is unclear for both researchers and practitioners what models perform best. Additionally, the field still lacks effective baselines, that is, the easy-to-use models that provide competitive performance across different problems.
In this work, we perform an overview of the main families of DL architectures for tabular data and raise the bar of baselines in tabular DL by identifying two simple and powerful deep architectures. The first one is a ResNet-like architecture which turns out to be a strong baseline that is often missing in prior works. The second model is our simple adaptation of the Transformer architecture for tabular data, which outperforms other solutions on most tasks. Both models are compared to many existing architectures on a diverse set of tasks under the same training and tuning protocols. We also compare the best DL models with Gradient Boosted Decision Trees and conclude that there is still no universally superior solution.
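
The abstract names a ResNet-like architecture but does not describe it. As a rough illustration of the general idea only, the sketch below stacks residual blocks over a row of numeric features, assuming each block computes x + Linear(ReLU(Linear(x))). All function names, layer sizes, and the random initialization are hypothetical, and the paper's actual block may differ (for instance in normalization and dropout).

```python
import random

def linear(x, weights, bias):
    """Affine map: one output value per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def residual_block(x, w1, b1, w2, b2):
    """Hypothetical block (an assumption): x + Linear(ReLU(Linear(x)))."""
    hidden = relu(linear(x, w1, b1))
    update = linear(hidden, w2, b2)
    # The skip connection adds the block input back to the block output.
    return [xi + ui for xi, ui in zip(x, update)]

def init_linear(n_out, n_in, rng):
    """Small random weights and zero bias; an illustrative choice only."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def resnet_like_forward(features, blocks, head):
    """Apply each residual block in turn, then a final linear head."""
    x = features
    for w1, b1, w2, b2 in blocks:
        x = residual_block(x, w1, b1, w2, b2)
    w_out, b_out = head
    return linear(x, w_out, b_out)

rng = random.Random(0)
n_features, n_hidden = 4, 8  # hypothetical sizes
blocks = []
for _ in range(2):
    w1, b1 = init_linear(n_hidden, n_features, rng)
    w2, b2 = init_linear(n_features, n_hidden, rng)
    blocks.append((w1, b1, w2, b2))
head = init_linear(1, n_features, rng)

row = [0.5, -1.2, 3.0, 0.0]  # one made-up table row of numeric features
prediction = resnet_like_forward(row, blocks, head)
print(len(prediction))
```

Because of the skip connection, a block whose weights are all zero returns its input unchanged, which is one common explanation for why such stacks tend to be easy to optimize.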

Comments:	NeurIPS 2021 camera-ready. Code: this https URL (v3-v5: minor changes)
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2106.11959 [cs.LG]
	(or arXiv:2106.11959v5 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2106.11959

Submission history

From: Yury Gorishniy
[v1] Tue, 22 Jun 2021 17:58:10 UTC (1,135 KB)
[v2] Wed, 10 Nov 2021 18:52:23 UTC (2,309 KB)
[v3] Wed, 26 Jul 2023 15:57:25 UTC (1,158 KB)
[v4] Wed, 25 Oct 2023 17:59:45 UTC (1,158 KB)
[v5] Thu, 26 Oct 2023 12:00:03 UTC (1,158 KB)
