Repository for Multimodal AutoML Benchmark

Last update: Nov 24, 2022

Overview

Benchmarking Multimodal AutoML for Tabular Data with Text Fields

This repository accompanies the NeurIPS 2021 Dataset Track submission "Benchmarking Multimodal AutoML for Tabular Data with Text Fields" (Link, Full Paper with Appendix). An earlier version of the paper, "Multimodal AutoML on Structured Tables with Text Fields" (Link), was accepted as an oral presentation at the ICML 2021 AutoML workshop. Because the benchmark has since been extended with more datasets, the version used in the workshop paper is archived in the icml_workshop branch.

This benchmark contains a diverse collection of tabular datasets. Each dataset contains numeric/categorical as well as text columns. The goal is to evaluate the performance of (automated) ML systems for supervised learning (classification and regression) with such multimodal data. The folder multimodal_text_benchmark/scripts/benchmark/ provides Python scripts to run different variants of the AutoGluon and H2O AutoML tools on the benchmark.

Datasets used in the Benchmark

Here's a brief summary of the datasets in our benchmark. Each dataset is described in greater detail in the multimodal_text_benchmark/ folder.

ID	key	#Train	#Test	Task	Metric	Prediction Target
prod	product_sentiment_machine_hack	5,091	1,273	multiclass	accuracy	sentiment related to product
salary	data_scientist_salary	15,841	3,961	multiclass	accuracy	salary range in data scientist job listings
airbnb	melbourne_airbnb	18,316	4,579	multiclass	accuracy	price of Airbnb listing
channel	news_channel	20,284	5,071	multiclass	accuracy	category of news article
wine	wine_reviews	84,123	21,031	multiclass	accuracy	variety of wine
imdb	imdb_genre_prediction	800	200	binary	roc_auc	whether film is a drama
fake	fake_job_postings2	12,725	3,182	binary	roc_auc	whether job postings are fake
kick	kick_starter_funding	86,052	21,626	binary	roc_auc	will Kickstarter get funding
jigsaw	jigsaw_unintended_bias100K	100,000	25,000	binary	roc_auc	whether comments are toxic
qaa	google_qa_answer_type_reason_explanation	4,863	1,216	regression	r2	type of answer
qaq	google_qa_question_type_reason_explanation	4,863	1,216	regression	r2	type of question
book	bookprice_prediction	4,989	1,248	regression	r2	price of books
jc	jc_penney_products	10,860	2,715	regression	r2	price of JC Penney products
cloth	women_clothing_review	18,788	4,698	regression	r2	review score
ae	ae_price_prediction	22,662	5,666	regression	r2	American-Eagle item prices
pop	news_popularity2	24,007	6,002	regression	r2	news article popularity online
house	california_house_price	24,007	6,002	regression	r2	sale price of houses in California
mercari	mercari_price_suggestion100K	100,000	25,000	regression	r2	price of Mercari products
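
The Metric column names three scores: accuracy for multiclass tasks, roc_auc for binary tasks, and r2 for regression. As a rough illustration of what each score measures, here is a minimal plain-Python sketch. The function names and the `METRICS` lookup are hypothetical and are not part of the benchmark code; the benchmark scripts may compute these scores differently (for example via scikit-learn's `accuracy_score`, `roc_auc_score`, and `r2_score`).

```python
from typing import Callable, Dict, Sequence


def accuracy(y_true: Sequence, y_pred: Sequence) -> float:
    """Fraction of predictions that exactly match the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for illustration but slow on large test sets.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("roc_auc needs at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical lookup keyed by the names used in the Metric column.
METRICS: Dict[str, Callable] = {"accuracy": accuracy, "roc_auc": roc_auc, "r2": r2}


def evaluate(metric_name: str, y_true: Sequence, y_pred: Sequence) -> float:
    """Score predictions with the metric named in the table."""
    return METRICS[metric_name](y_true, y_pred)
```

For the binary tasks, `y_pred` should hold predicted scores or probabilities for the positive class rather than hard labels, since roc_auc ranks examples by score.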

License

The versions of datasets in this benchmark are released under the CC BY-NC-SA license. Note that the datasets in this benchmark are modified versions of previously publicly-available original copies and we do not own any of the datasets in the benchmark. Any data from this benchmark which has previously been published elsewhere falls under the original license from which the data originated. Please refer to the licenses of each original source linked in the multimodal_text_benchmark/README.md.

Install the Benchmark Suite

cd multimodal_text_benchmark
# Install the benchmarking suite
python3 -m pip install -U -e .

To quickly test the installation, go to the tests folder (the path below is relative to the repository root) and run the dataset tests:

cd multimodal_text_benchmark/tests
python3 -m pytest test_datasets.py

To work with one of the datasets, use the following code:

from auto_mm_bench.datasets import dataset_registry

print(dataset_registry.list_keys())  # list of all dataset names
dataset_name = 'product_sentiment_machine_hack'

train_dataset = dataset_registry.create(dataset_name, 'train')
test_dataset = dataset_registry.create(dataset_name, 'test')
print(train_dataset.data)
print(test_dataset.data)
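
Since each dataset mixes numeric, categorical, and text columns, it can help to get a rough idea of which column is which after loading. The sketch below is only a heuristic illustration: `guess_column_kind` and its `text_min_avg_words` threshold are hypothetical and are not part of `auto_mm_bench`, which may expose its own column-type information.

```python
from numbers import Number
from typing import Iterable


def guess_column_kind(values: Iterable, text_min_avg_words: float = 3.0) -> str:
    """Guess whether a column is 'numeric', 'categorical', or 'text'.

    Heuristic only: a column whose non-missing values are all numbers is
    numeric; otherwise it is text if its values average at least
    `text_min_avg_words` whitespace-separated words, else categorical.
    """
    present = [v for v in values if v is not None]
    if not present:
        return "unknown"  # nothing to inspect
    # bool is a subclass of int, so exclude it from the numeric check.
    if all(isinstance(v, Number) and not isinstance(v, bool) for v in present):
        return "numeric"
    strings = [str(v) for v in present]
    avg_words = sum(len(s.split()) for s in strings) / len(strings)
    return "text" if avg_words >= text_min_avg_words else "categorical"


# Toy columns standing in for a loaded dataset.
example_columns = {
    "price": [12.5, 40, None, 7],
    "brand": ["acme", "globex", "acme"],
    "review": ["great phone with long battery life", "stopped working after a week"],
}
for name, column in example_columns.items():
    print(name, guess_column_kind(column))
```

Assuming `train_dataset.data` is a pandas DataFrame, the same function could be applied per column with `guess_column_kind(train_dataset.data[col].tolist())`.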

To access all datasets that comprise the benchmark:

from auto_mm_bench.datasets import create_dataset, TEXT_BENCHMARK_ALIAS_MAPPING

for dataset_name in list(TEXT_BENCHMARK_ALIAS_MAPPING.values()):
    print(dataset_name)
    dataset = create_dataset(dataset_name)

Run Experiments

Go to multimodal_text_benchmark/scripts/benchmark to see how to run some baseline ML methods over the benchmark.

References

BibTeX entry of the ICML Workshop Version:

@article{agmultimodaltext,
  title={Multimodal AutoML on Structured Tables with Text Fields},
  author={Shi, Xingjian and Mueller, Jonas and Erickson, Nick and Li, Mu and Smola, Alexander},
  journal={8th ICML Workshop on Automated Machine Learning (AutoML)},
  year={2021}
}

Owner

Xingjian Shi
