[NeurIPS 2021] Well-tuned Simple Nets Excel on Tabular Datasets

Overview

[NeurIPS 2021] Well-tuned Simple Nets Excel on Tabular Datasets

Introduction

This repo contains the source code accompanying the paper:

Well-tuned Simple Nets Excel on Tabular Datasets

Authors: Arlind Kadra, Marius Lindauer, Frank Hutter, Josif Grabocka

Tabular datasets are the last "unconquered castle" for deep learning, with traditional ML methods like Gradient-Boosted Decision Trees still performing strongly even against recent specialized neural architectures. In this paper, we hypothesize that the key to boosting the performance of neural networks lies in rethinking the joint and simultaneous application of a large set of modern regularization techniques. As a result, we propose regularizing plain Multilayer Perceptron (MLP) networks by searching for the optimal combination/cocktail of 13 regularization techniques for each dataset using a joint optimization over the decision on which regularizers to apply and their subsidiary hyperparameters.

We empirically assess the impact of these regularization cocktails for MLPs on a large-scale empirical study comprising 40 tabular datasets and demonstrate that: (i) well-regularized plain MLPs significantly outperform recent state-of-the-art specialized neural network architectures, and (ii) they even outperform strong traditional ML methods, such as XGBoost.

News: Our work is accepted in the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021).

Setting up the virtual environment

Our work is built on top of AutoPyTorch. To look at our implementation of the regularization cocktail ingredients, you can do the following:

git clone https://github.com/automl/Auto-PyTorch.git
cd Auto-PyTorch/
git checkout regularization_cocktails

To install the version of AutoPyTorch that features our work, you can use these additional commands:

# The following commands assume the user is in the cloned directory
conda create -n reg_cocktails python=3.8
conda activate reg_cocktails
conda install gxx_linux-64 gcc_linux-64 swig
cat requirements.txt | xargs -n 1 -L 1 pip install
python setup.py install

Running the Regularization Cocktail code

The main files to run the regularization cocktails are in the cocktails folder and are main_experiment.py and refit_experiment.py. The first module can be used to start a full HPO search, while, the other module can be used to refit on certain datasets when the time does not suffice to perform the full HPO search and to complete the refit of the incumbent hyperparameter configuration.

The main arguments for main_experiment.py:

  • --task_id: The task id in OpenML. Basically the dataset that will be used in the experiment.
  • --wall_time: The total runtime to be used. It is the total runtime for the HPO search and also final refit.
  • --func_eval_time: The maximal time for one function evaluation parametrized by a certain hyperparameter configuration.
  • --epochs: The number of epochs for one hyperparameter configuration to be evaluated on.
  • --seed: The seed to be used for the run.
  • --tmp_dir: The temporary directory for the results to be stored in.
  • --output_dir: The output directory for the results to be stored in.
  • --nr_workers: The number of workers which corresponds to the number of hyperparameter configurations run in parallel.
  • --nr_threads: The number of threads.
  • --cash_cocktail: An important flag that activates the regularization cocktail formulation.

A minimal example of running the regularization cocktails:

python main_experiment.py --task_id 233088 --wall_time 600 --func_eval_time 60 --epochs 10 --seed 42 --cash_cocktail True

The example above will run the regularization cocktails for 10 minutes, with a function evaluation limit of 50 seconds for task 233088. Every hyperparameter configuration will be evaluated for 10 epochs, the seed 42 will be used for the experiment and data splits.

A minimal example of running only one regularization method:

python main_experiment.py --task_id 233088 --wall_time 600 --func_eval_time 60 --epochs 10 --seed 42 --use_weight_decay

In case you would like to investigate individual regularization methods, you can look at the different arguments that control them in the main_experiment.py. Additionally, if you want to remove the limit on the number of hyperparameter configurations, you can remove the following lines:

smac_scenario_args={
    'runcount_limit': number_of_configurations_limit,
}

Plots

The plots that are included in our paper were generated from the functions in the module results.py. Although mentioned in most function documentations, most of the functions that plot the baseline diagrams and plots expect a folder structure as follows:

common_result_folder/baseline/results.csv

There are functions inside the module itself that generate the results.csv files.

Baselines

The code for running the baselines can be found in the baselines folder.

  • TabNet, XGBoost, CatBoost can be found in the baselines/bohb folder.
  • The other baselines like AutoGluon, auto-sklearn and Node can be found in the corresponding folders named the same.

TabNet, XGBoost, CatBoost and AutoGluon have the same two main files as our regularization cocktails, main_experiment.py and refit_experiment.py.

Figures

alt text

Citation

@article{kadra2021regularization,
  title={Regularization is all you Need: Simple Neural Nets can Excel on Tabular Data},
  author={Kadra, Arlind and Lindauer, Marius and Hutter, Frank and Grabocka, Josif},
  journal={arXiv preprint arXiv:2106.11189},
  year={2021}
}
Implementation of Hierarchical Transformer Memory (HTM) for Pytorch

Hierarchical Transformer Memory (HTM) - Pytorch Implementation of Hierarchical Transformer Memory (HTM) for Pytorch. This Deepmind paper proposes a si

Phil Wang 63 Dec 29, 2022
MCMC samplers for Bayesian estimation in Python, including Metropolis-Hastings, NUTS, and Slice

Sampyl May 29, 2018: version 0.3 Sampyl is a package for sampling from probability distributions using MCMC methods. Similar to PyMC3 using theano to

Mat Leonard 304 Dec 25, 2022
Speckle-free Holography with Partially Coherent Light Sources and Camera-in-the-loop Calibration

Speckle-free Holography with Partially Coherent Light Sources and Camera-in-the-loop Calibration Project Page | Paper Yifan Peng*, Suyeon Choi*, Jongh

Stanford Computational Imaging Lab 19 Dec 11, 2022
EMNLP'2021: SimCSE: Simple Contrastive Learning of Sentence Embeddings

SimCSE: Simple Contrastive Learning of Sentence Embeddings This repository contains the code and pre-trained models for our paper SimCSE: Simple Contr

Princeton Natural Language Processing 2.5k Dec 29, 2022
Unified unsupervised and semi-supervised domain adaptation network for cross-scenario face anti-spoofing, Pattern Recognition

USDAN The implementation of Unified unsupervised and semi-supervised domain adaptation network for cross-scenario face anti-spoofing, which is accepte

11 Nov 03, 2022
An interactive DNN Model deployed on web that predicts the chance of heart failure for a patient with an accuracy of 98%

Heart Failure Predictor About A Web UI deployed Dense Neural Network Model Made using Tensorflow that predicts whether the patient is healthy or has c

Adit Ahmedabadi 0 Jan 09, 2022
Rohit Ingole 2 Mar 24, 2022
Constrained Language Models Yield Few-Shot Semantic Parsers

Constrained Language Models Yield Few-Shot Semantic Parsers This repository contains tools and instructions for reproducing the experiments in the pap

Microsoft 43 Nov 23, 2022
:boar: :bear: Deep Learning based Python Library for Stock Market Prediction and Modelling

bulbea "Deep Learning based Python Library for Stock Market Prediction and Modelling." Table of Contents Installation Usage Documentation Dependencies

Achilles Rasquinha 1.8k Jan 05, 2023
A web-based application for quick, scalable, and automated hyperparameter tuning and stacked ensembling in Python.

Xcessiv Xcessiv is a tool to help you create the biggest, craziest, and most excessive stacked ensembles you can think of. Stacked ensembles are simpl

Reiichiro Nakano 1.3k Nov 17, 2022
This repository contains an overview of important follow-up works based on the original Vision Transformer (ViT) by Google.

This repository contains an overview of important follow-up works based on the original Vision Transformer (ViT) by Google.

75 Dec 02, 2022
JittorVis - Visual understanding of deep learning models

JittorVis: Visual understanding of deep learning model JittorVis is an open-source library for understanding the inner workings of Jittor models by vi

thu-vis 182 Jan 06, 2023
Official repository for "Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems"

Action-Based Conversations Dataset (ABCD) This respository contains the code and data for ABCD (Chen et al., 2021) Introduction Whereas existing goal-

ASAPP Research 49 Oct 09, 2022
Code for Learning Manifold Patch-Based Representations of Man-Made Shapes, in ICLR 2021.

LearningPatches | Webpage | Paper | Video Learning Manifold Patch-Based Representations of Man-Made Shapes Dmitriy Smirnov, Mikhail Bessmeltsev, Justi

Dima Smirnov 22 Nov 14, 2022
This is a repository of our model for weakly-supervised video dense anticipation.

Introduction This is a repository of our model for weakly-supervised video dense anticipation. More results on GTEA, Epic-Kitchens etc. will come soon

2 Apr 09, 2022
Dense Unsupervised Learning for Video Segmentation (NeurIPS*2021)

Dense Unsupervised Learning for Video Segmentation This repository contains the official implementation of our paper: Dense Unsupervised Learning for

Visual Inference Lab @TU Darmstadt 173 Dec 26, 2022
Official implementation of Rich Semantics Improve Few-Shot Learning (BMVC, 2021)

Rich Semantics Improve Few-Shot Learning Paper Link Abstract : Human learning benefits from multi-modal inputs that often appear as rich semantics (e.

Mohamed Afham 11 Jul 26, 2022
Enhancing Column Generation by a Machine-Learning-BasedPricing Heuristic for Graph Coloring

Enhancing Column Generation by a Machine-Learning-BasedPricing Heuristic for Graph Coloring (to appear at AAAI 2022) We propose a machine-learning-bas

YunzhuangS 2 May 02, 2022
A PyTorch implementation of "Pathfinder Discovery Networks for Neural Message Passing"

A PyTorch implementation of "Pathfinder Discovery Networks for Neural Message Passing" (WebConf 2021). Abstract In this work we propose Pathfind

Benedek Rozemberczki 49 Dec 01, 2022
Replication of Pix2Seq with Pretrained Model

Pretrained-Pix2Seq We provide the pre-trained model of Pix2Seq. This version contains new data augmentation. The model is trained for 300 epochs and c

peng gao 51 Nov 22, 2022