[ICLR 2021 Spotlight] Pytorch implementation for "Long-tailed Recognition by Routing Diverse Distribution-Aware Experts."

Overview

RIDE: Long-tailed Recognition by Routing Diverse Distribution-Aware Experts.

by Xudong Wang, Long Lian, Zhongqi Miao, Ziwei Liu and Stella X. Yu at UC Berkeley/ICSI and NTU

International Conference on Learning Representations (ICLR), 2021. Spotlight Presentation

Project Page | PDF | Preprint | OpenReview | Slides | Citation

This repository contains an official re-implementation of RIDE from the authors, while also has plans to support other works on long-tailed recognition. Further information please contact Xudong Wang and Long Lian.

Citation

If you find our work inspiring or use our codebase in your research, please consider giving a star and a citation.

@inproceedings{wang2021longtailed,
  title={Long-tailed Recognition by Routing Diverse Distribution-Aware Experts},
  author={Xudong Wang and Long Lian and Zhongqi Miao and Ziwei Liu and Stella Yu},
  booktitle={International Conference on Learning Representations},
  year={2021},
  url={https://openreview.net/forum?id=D9I3drBz4UC}
}

Supported Methods for Long-tailed Recognition:

  • RIDE
  • Cross-Entropy (CE) Loss
  • Focal Loss
  • LDAM Loss
  • Decouple: cRT (limited support for now)
  • Decouple: tau-normalization (limited support for now)

Updates

[04/2021] Pre-trained models are avaliable in model zoo.

[12/2020] We added an approximate GFLops counter. See usages below. We also refactored the code and fixed a few errors.

[12/2020] We have limited support on cRT and tau-norm in load_stage1 option and t-normalization.py, please look at the code comments for instructions while we are still working on it.

[12/2020] Initial Commit. We re-implemented RIDE in this repo. LDAM/Focal/Cross-Entropy loss is also re-implemented (instruction below).

Table of contents

Requirements

Packages

  • Python >= 3.7, < 3.9
  • PyTorch >= 1.6
  • tqdm (Used in test.py)
  • tensorboard >= 1.14 (for visualization)
  • pandas
  • numpy

Hardware requirements

8 GPUs with >= 11G GPU RAM are recommended. Otherwise the model with more experts may not fit in, especially on datasets with more classes (the FC layers will be large). We do not support CPU training, but CPU inference could be supported by slight modification.

Dataset Preparation

CIFAR code will download data automatically with the dataloader. We use data the same way as classifier-balancing. For ImageNet-LT and iNaturalist, please prepare data in the data directory. ImageNet-LT can be found at this link. iNaturalist data should be the 2018 version from this repo (Note that it requires you to pay to download now). The annotation can be found at here. Please put them in the same location as below:

data
├── cifar-100-python
│   ├── file.txt~
│   ├── meta
│   ├── test
│   └── train
├── cifar-100-python.tar.gz
├── ImageNet_LT
│   ├── ImageNet_LT_open.txt
│   ├── ImageNet_LT_test.txt
│   ├── ImageNet_LT_train.txt
│   ├── ImageNet_LT_val.txt
│   ├── test
│   ├── train
│   └── val
└── iNaturalist18
    ├── iNaturalist18_train.txt
    ├── iNaturalist18_val.txt
    └── train_val2018

How to get pretrained checkpoints

We have a model zoo available.

Training and Evaluation Instructions

Imbalanced CIFAR 100/CIFAR100-LT

RIDE Without Distill (Stage 1)
python train.py -c "configs/config_imbalance_cifar100_ride.json" --reduce_dimension 1 --num_experts 3

Note: --reduce_dimension 1 means set reduce dimension to True. The template has an issue with bool arguments so int argument is used here. However, any non-zero value will be equivalent to bool True.

RIDE With Distill (Stage 1)
python train.py -c "configs/config_imbalance_cifar100_distill_ride.json" --reduce_dimension 1 --num_experts 3 --distill_checkpoint path_to_checkpoint

Distillation is not required but could be performed if you'd like further improvements.

RIDE Expert Assignment Module Training (Stage 2)
python train.py -c "configs/config_imbalance_cifar100_ride_ea.json" -r path_to_stage1_checkpoint --reduce_dimension 1 --num_experts 3

Note: different runs will result in different EA modules with different trade-off. Some modules give higher accuracy but require higher FLOps. Although the only difference is not underlying ability to classify but the "easiness to satisfy and stop". You can tune the pos_weight if you think the EA module consumes too much compute power or is using too few expert.

ImageNet-LT

RIDE Without Distill (Stage 1)

ResNet 10
python train.py -c "configs/config_imagenet_lt_resnet10_ride.json" --reduce_dimension 1 --num_experts 3
ResNet 50
python train.py -c "configs/config_imagenet_lt_resnet50_ride.json" --reduce_dimension 1 --num_experts 3
ResNeXt 50
python train.py -c "configs/config_imagenet_lt_resnext50_ride.json" --reduce_dimension 1 --num_experts 3

RIDE With Distill (Stage 1)

ResNet 10
python train.py -c "configs/config_imagenet_lt_resnet10_distill_ride.json" --reduce_dimension 1 --num_experts 3 --distill_checkpoint path_to_checkpoint
ResNet 50
python train.py -c "configs/config_imagenet_lt_resnet50_distill_ride.json" --reduce_dimension 1 --num_experts 3 --distill_checkpoint path_to_checkpoint
ResNeXt 50
python train.py -c "configs/config_imagenet_lt_resnext50_distill_ride.json" --reduce_dimension 1 --num_experts 3 --distill_checkpoint path_to_checkpoint

RIDE Expert Assignment Module Training (Stage 2)

ResNet 10
python train.py -c "configs/config_imagenet_lt_resnet10_ride_ea.json" -r path_to_stage1_checkpoint --reduce_dimension 1 --num_experts 3
ResNet 50
python train.py -c "configs/config_imagenet_lt_resnet50_ride_ea.json" -r path_to_stage1_checkpoint --reduce_dimension 1 --num_experts 3
ResNeXt 50
python train.py -c "configs/config_imagenet_lt_resnext50_ride_ea.json" -r path_to_stage1_checkpoint --reduce_dimension 1 --num_experts 3

iNaturalist

RIDE Without Distill (Stage 1)

python train.py -c "configs/config_iNaturalist_resnet50_ride.json" --reduce_dimension 1 --num_experts 3

RIDE With Distill (Stage 1)

python train.py -c "configs/config_iNaturalist_resnet50_distill_ride.json" --reduce_dimension 1 --num_experts 3 --distill_checkpoint path_to_checkpoint

RIDE Expert Assignment Module Training (Stage 2)

python train.py -c "configs/config_iNaturalist_resnet50_ride_ea.json" -r path_to_stage1_checkpoint --reduce_dimension 1 --num_experts 3

Using Other Methods with RIDE

  • Focal Loss: switch the loss to Focal Loss
  • Cross Entropy: switch the loss to Cross Entropy Loss

Test

To test a checkpoint, please put it with the corresponding config file.

python test.py -r path_to_checkpoint

Please see the pytorch template that we use for additional more general usages of this project (e.g. loading from a checkpoint, etc.).

GFLops calculation

We provide an experimental support for approximate GFLops calculation. Please open an issue if you encounter any problem or meet inconsistency in GFLops.

You need to install thop package first. Then, according to your model, run python -m utils.gflops (args) in the project directory.

Examples and explanations

Use python -m utils.gflops to see the documents as well as explanations for this calculator.

ImageNet-LT
python -m utils.gflops ResNeXt50Model 0 --num_experts 3 --reduce_dim True --use_norm False

To change model, switch ResNeXt50Model to the ones used in your config. use_norm comes with LDAM-based methods (including RIDE). reduce_dim is used in default RIDE models. The 0 in the command line indicates the dataset.

All supported datasets:

  • 0: ImageNet-LT
  • 1: iNaturalist
  • 2: Imbalance CIFAR 100
iNaturalist
python -m utils.gflops ResNet50Model 1 --num_experts 3 --reduce_dim True --use_norm True
Imbalance CIFAR 100
python -m utils.gflops ResNet32Model 2 --num_experts 3 --reduce_dim True --use_norm True
Special circumstances: calculate the approximate GFLops in models with expert assignment module

We provide a ea_percentage for specifying the percentage of data that pass each expert. Note that you need to switch to the EA model as well since you actually use EA model instead of the original model in training and inference.

An example:

python -m utils.gflops ResNet32EAModel 2 --num_experts 3 --reduce_dim True --use_norm True --ea_percentage 40.99,9.47,49.54

FAQ

See FAQ.

How to get support from us?

If you have any general questions, feel free to email us at longlian at berkeley.edu and xdwang at eecs.berkeley.edu. If you have code or implementation-related questions, please feel free to send emails to us or open an issue in this codebase (We recommend that you open an issue in this codebase, because your questions may help others).

Pytorch template

This is a project based on this pytorch template. The readme of the template explains its functionality, although we try to list most frequently used ones in this readme.

License

This project is licensed under the MIT License. See LICENSE for more details. The parts described below follow their original license.

Acknowledgements

This is a project based on this pytorch template. The pytorch template is inspired by the project Tensorflow-Project-Template by Mahmoud Gemy

The ResNet and ResNeXt in fb_resnets are based on from Classifier-Balancing/Decouple. The ResNet in ldam_drw_resnets/LDAM loss/CIFAR-LT are based on LDAM-DRW. KD implementation takes references from CRD/RepDistiller.

Owner
Xudong (Frank) Wang
Ph.D. Student @ EECS, UC Berkeley; Graduate Student Researcher @ International Computer Science Institute, Berkeley, USA
Xudong (Frank) Wang
Code for our paper "Mask-Align: Self-Supervised Neural Word Alignment" in ACL 2021

Mask-Align: Self-Supervised Neural Word Alignment This is the implementation of our work Mask-Align: Self-Supervised Neural Word Alignment. @inproceed

THUNLP-MT 46 Dec 15, 2022
Simple Speech to Text, Text to Speech

Simple Speech to Text, Text to Speech 1. Download Repository Opsi 1 Download repository ini, extract di lokasi yang diinginkan Opsi 2 Jika sudah famil

Habib Abdurrasyid 5 Dec 28, 2021
Multi Task Vision and Language

12-in-1: Multi-Task Vision and Language Representation Learning Please cite the following if you use this code. Code and pre-trained models for 12-in-

Meta Research 711 Jan 08, 2023
This repo stores the codes for topic modeling on palliative care journals.

This repo stores the codes for topic modeling on palliative care journals. Data Preparation You first need to download the journal papers. bash 1_down

3 Dec 20, 2022
A PyTorch Implementation of End-to-End Models for Speech-to-Text

speech Speech is an open-source package to build end-to-end models for automatic speech recognition. Sequence-to-sequence models with attention, Conne

Awni Hannun 647 Dec 25, 2022
A PyTorch implementation of the Transformer model in "Attention is All You Need".

Attention is all you need: A Pytorch Implementation This is a PyTorch implementation of the Transformer model in "Attention is All You Need" (Ashish V

Yu-Hsiang Huang 7.1k Jan 05, 2023
This is a MD5 password/passphrase brute force tool

CROWES-PASS-CRACK-TOOl This is a MD5 password/passphrase brute force tool How to install: Do 'git clone https://github.com/CROW31/CROWES-PASS-CRACK-TO

9 Mar 02, 2022
this repository has datasets containing information of Uber pickups in NYC from April 2014 to September 2014 and January to June 2015. data Analysis , virtualization and some insights are gathered here

uber-pickups-analysis Data Source: https://www.kaggle.com/fivethirtyeight/uber-pickups-in-new-york-city Information about data set The dataset contain

1 Nov 02, 2021
The (extremely) naive sentiment classification function based on NBSVM trained on wisesight_sentiment

thai_sentiment The naive sentiment classification function based on NBSVM trained on wisesight_sentiment วิธีติดตั้ง pip install thai_sentiment==0.1.3

Charin 7 Dec 08, 2022
Azure Text-to-speech service for Home Assistant

Azure Text-to-speech service for Home Assistant The Azure text-to-speech platform uses online Azure Text-to-Speech cognitive service to read a text wi

Yassine Selmi 2 Aug 06, 2022
Indobenchmark are collections of Natural Language Understanding (IndoNLU) and Natural Language Generation (IndoNLG)

Indobenchmark Toolkit Indobenchmark are collections of Natural Language Understanding (IndoNLU) and Natural Language Generation (IndoNLG) resources fo

Samuel Cahyawijaya 11 Aug 26, 2022
[ICCV 2021] Instance-level Image Retrieval using Reranking Transformers

Instance-level Image Retrieval using Reranking Transformers Fuwen Tan, Jiangbo Yuan, Vicente Ordonez, ICCV 2021. Abstract Instance-level image retriev

UVA Computer Vision 86 Dec 28, 2022
LV-BERT: Exploiting Layer Variety for BERT (Findings of ACL 2021)

LV-BERT Introduction In this repo, we introduce LV-BERT by exploiting layer variety for BERT. For detailed description and experimental results, pleas

Weihao Yu 14 Aug 24, 2022
Transformer - A TensorFlow Implementation of the Transformer: Attention Is All You Need

[UPDATED] A TensorFlow Implementation of Attention Is All You Need When I opened this repository in 2017, there was no official code yet. I tried to i

Kyubyong Park 3.8k Dec 26, 2022
Practical Natural Language Processing Tools for Humans is build on the top of Senna Natural Language Processing (NLP)

Practical Natural Language Processing Tools for Humans is build on the top of Senna Natural Language Processing (NLP) predictions: part-of-speech (POS) tags, chunking (CHK), name entity recognition (

jawahar 20 Apr 30, 2022
Topic Inference with Zeroshot models

zeroshot_topics Table of Contents Installation Usage License Installation zeroshot_topics is distributed on PyPI as a universal wheel and is available

Rita Anjana 55 Nov 28, 2022
Generate custom detailed survey paper with topic clustered sections and proper citations, from just a single query in just under 30 mins !!

Auto-Research A no-code utility to generate a detailed well-cited survey with topic clustered sections (draft paper format) and other interesting arti

Sidharth Pal 20 Dec 14, 2022
⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x using fastT5.

Reduce T5 model size by 3X and increase the inference speed up to 5X. Install Usage Details Functionalities Benchmarks Onnx model Quantized onnx model

Kiran R 399 Jan 05, 2023
A unified tokenization tool for Images, Chinese and English.

ICE Tokenizer Token id [0, 20000) are image tokens. Token id [20000, 20100) are common tokens, mainly punctuations. E.g., icetk[20000] == 'unk', ice

THUDM 42 Dec 27, 2022
Neural text generators like the GPT models promise a general-purpose means of manipulating texts.

Boolean Prompting for Neural Text Generators Neural text generators like the GPT models promise a general-purpose means of manipulating texts. These m

Jeffrey M. Binder 20 Jan 09, 2023