[ICML 2021, Long Talk] Delving into Deep Imbalanced Regression

Last update: Dec 30, 2022

Overview

This repository contains the implementation code for the paper:
Delving into Deep Imbalanced Regression
Yuzhe Yang, Kaiwen Zha, Ying-Cong Chen, Hao Wang, Dina Katabi
38th International Conference on Machine Learning (ICML 2021), Long Oral
[Project Page] [Paper] [Video] [Blog Post]

Deep Imbalanced Regression (DIR) aims to learn from imbalanced data with continuous targets,
tackle potential missing data for certain regions, and generalize to the entire target range.

Beyond Imbalanced Classification: Brief Introduction for DIR

Existing techniques for learning from imbalanced data focus on targets with categorical indices, i.e., the targets are distinct classes. However, many real-world tasks involve continuous and even infinite target values. We systematically investigate Deep Imbalanced Regression (DIR), which aims to learn continuous targets from naturally imbalanced data, deal with potentially missing data for certain target values, and generalize to the entire target range.

We curate and benchmark large-scale DIR datasets for common real-world tasks in the computer vision, natural language processing, and healthcare domains, ranging from single-value prediction (age, text similarity score, health condition score) to dense-value prediction (depth).

Usage

We separate the codebase for different datasets into different subfolders. Please go into the subfolders for more information (e.g., installation, dataset preparation, training, evaluation & models).

IMDB-WIKI-DIR | AgeDB-DIR | NYUD2-DIR | STS-B-DIR

Highlights

(1) ✔️ New Task: Deep Imbalanced Regression (DIR)

(2) ✔️ New Techniques: label distribution smoothing (LDS) and feature distribution smoothing (FDS)

(3) ✔️ New Benchmarks:

Computer Vision: 💡 IMDB-WIKI-DIR (age) / AgeDB-DIR (age) / NYUD2-DIR (depth)
Natural Language Processing: 📋 STS-B-DIR (text similarity score)
Healthcare: 🏥 SHHS-DIR (health condition score)
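
The idea behind LDS can be illustrated with a small, self-contained sketch. It assumes LDS amounts to convolving the empirical label histogram with a symmetric Gaussian kernel and then weighting each sample by the inverse of the smoothed density. The function names (`gaussian_kernel`, `lds_weights`), the bin width of 1, the kernel parameters, and the toy labels are all illustrative and are not taken from this repository's code.

```python
import math
from collections import Counter


def gaussian_kernel(half_width, sigma):
    """Symmetric Gaussian window of length 2 * half_width + 1 that sums to 1."""
    raw = [math.exp(-(i * i) / (2.0 * sigma * sigma))
           for i in range(-half_width, half_width + 1)]
    total = sum(raw)
    return [v / total for v in raw]


def lds_weights(labels, num_bins, half_width=2, sigma=1.0):
    """Per-sample weights from a kernel-smoothed label histogram.

    labels: integer bin indices in [0, num_bins) (hypothetical pre-binned targets).
    """
    counts = Counter(labels)
    hist = [counts.get(b, 0) for b in range(num_bins)]
    kernel = gaussian_kernel(half_width, sigma)

    # Convolve the empirical histogram with the kernel (zero padding at the edges).
    smoothed = []
    for b in range(num_bins):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = b + k - half_width
            if 0 <= j < num_bins:
                acc += w * hist[j]
        smoothed.append(acc)

    # Inverse-density weights, rescaled so that the mean weight is 1.
    raw = [1.0 / smoothed[y] for y in labels]
    scale = len(raw) / sum(raw)
    return [r * scale for r in raw]


# Toy example: ages 30 and 31 are common, age 60 is rare and isolated.
labels = [30] * 8 + [31] * 4 + [60]
weights = lds_weights(labels, num_bins=100)
```

In this toy setup the single sample at 60 receives the largest weight, while the sample counts at 30 and 31 partly "share" density through the kernel, so their weights differ less than plain inverse-frequency weighting would suggest. Such weights could then multiply a per-sample regression loss.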


Updates

[06/2021] We provide a hands-on tutorial of DIR. Check it out!
[05/2021] We created a blog post for this work (a Chinese version is also available here). Check it out for more details!
[05/2021] Paper accepted to ICML 2021 as a Long Talk. We have released the code and models. You can find all reproduced checkpoints via this link, or go into each subfolder for the models of each dataset.
[02/2021] arXiv version posted. Please stay tuned for updates.

Citation

If you find this code or idea useful, please cite our work:

@inproceedings{yang2021delving,
  title={Delving into Deep Imbalanced Regression},
  author={Yang, Yuzhe and Zha, Kaiwen and Chen, Ying-Cong and Wang, Hao and Katabi, Dina},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2021}
}

Contact

If you have any questions, feel free to contact us by email ([email protected] & [email protected]) or through GitHub issues. Enjoy!


