Pytorch implementation of TailCalibX : Feature Generation for Long-tail Classification

Overview

TailCalibX : Feature Generation for Long-tail Classification

by Rahul Vigneswaran, Marc T. Law, Vineeth N. Balasubramanian, Makarand Tapaswi

[arXiv] [Code] [pip Package] [Video] TailCalibX methodology

Table of contents

๐Ÿฃ Easy Usage (Recommended way to use our method)

โš  Caution: TailCalibX is just TailCalib employed multiple times. Specifically, we generate a set of features once every epoch and use them to train the classifier. In order to mimic that, three things must be done at every epoch in the following order:

  1. Collect all the features from your dataloader.
  2. Use the tailcalib package to make the features balanced by generating samples.
  3. Train the classifier.
  4. Repeat.

๐Ÿ’ป Installation

Use the package manager pip to install tailcalib.

pip install tailcalib

๐Ÿ‘จโ€๐Ÿ’ป Example Code

Check the instruction here for a much more detailed python package information.

# Import
from tailcalib import tailcalib

# Initialize
a = tailcalib(base_engine="numpy")   # Options: "numpy", "pytorch"

# Imbalanced random fake data
import numpy as np
X = np.random.rand(200,100)
y = np.random.randint(0,10, (200,))

# Balancing the data using "tailcalib"
feat, lab, gen = a.generate(X=X, y=y)

# Output comparison
print(f"Before: {np.unique(y, return_counts=True)}")
print(f"After: {np.unique(lab, return_counts=True)}")

๐Ÿงช Advanced Usage

โœ” Things to do before you run the code from this repo

  • Change the data_root for your dataset in main.py.
  • If you are using wandb logging (Weights & Biases), make sure to change the wandb.init in main.py accordingly.

๐Ÿ“€ How to use?

  • For just the methods proposed in this paper :
    • For CIFAR100-LT: run_TailCalibX_CIFAR100-LT.sh
    • For mini-ImageNet-LT : run_TailCalibX_mini-ImageNet-LT.sh
  • For all the results show in the paper :
    • For CIFAR100-LT: run_all_CIFAR100-LT.sh
    • For mini-ImageNet-LT : run_all_mini-ImageNet-LT.sh

๐Ÿ“š How to create the mini-ImageNet-LT dataset?

Check Notebooks/Create_mini-ImageNet-LT.ipynb for the script that generates the mini-ImageNet-LT dataset with varying imbalance ratios and train-test-val splits.

โš™ Arguments

  • --seed : Select seed for fixing it.

    • Default : 1
  • --gpu : Select the GPUs to be used.

    • Default : "0,1,2,3"
  • --experiment: Experiment number (Check 'libs/utils/experiment_maker.py').

    • Default : 0.1
  • --dataset : Dataset number.

    • Choices : 0 - CIFAR100, 1 - mini-imagenet
    • Default : 0
  • --imbalance : Select Imbalance factor.

    • Choices : 0: 1, 1: 100, 2: 50, 3: 10
    • Default : 1
  • --type_of_val : Choose which dataset split to use.

    • Choices: "vt": val_from_test, "vtr": val_from_train, "vit": val_is_test
    • Default : "vit"
  • --cv1 to --cv9 : Custom variable to use in experiments - purpose changes according to the experiment.

    • Default : "1"
  • --train : Run training sequence

    • Default : False
  • --generate : Run generation sequence

    • Default : False
  • --retraining : Run retraining sequence

    • Default : False
  • --resume : Will resume from the 'latest_model_checkpoint.pth' and wandb if applicable.

    • Default : False
  • --save_features : Collect feature representations.

    • Default : False
  • --save_features_phase : Dataset split of representations to collect.

    • Choices : "train", "val", "test"
    • Default : "train"
  • --config : If you have a yaml file with appropriate config, provide the path here. Will override the 'experiment_maker'.

    • Default : None

๐Ÿ‹๏ธโ€โ™‚๏ธ Trained weights

Experiment CIFAR100-LT (ResNet32, seed 1, Imb 100) mini-ImageNet-LT (ResNeXt50)
TailCalib Git-LFS Git-LFS
TailCalibX Git-LFS Git-LFS
CBD + TailCalibX Git-LFS Git-LFS

๐Ÿช€ Results on a Toy Dataset

Open In Colab

The higher the Imb ratio, the more imbalanced the dataset is. Imb ratio = maximum_sample_count / minimum_sample_count.

Check this notebook to play with the toy example from which the plot below was generated.

๐ŸŒด Directory Tree

TailCalibX
โ”œโ”€โ”€ libs
โ”‚   โ”œโ”€โ”€ core
โ”‚   โ”‚   โ”œโ”€โ”€ ce.py
โ”‚   โ”‚   โ”œโ”€โ”€ core_base.py
โ”‚   โ”‚   โ”œโ”€โ”€ ecbd.py
โ”‚   โ”‚   โ”œโ”€โ”€ modals.py
โ”‚   โ”‚   โ”œโ”€โ”€ TailCalib.py
โ”‚   โ”‚   โ””โ”€โ”€ TailCalibX.py
โ”‚   โ”œโ”€โ”€ data
โ”‚   โ”‚   โ”œโ”€โ”€ dataloader.py
โ”‚   โ”‚   โ”œโ”€โ”€ ImbalanceCIFAR.py
โ”‚   โ”‚   โ””โ”€โ”€ mini-imagenet
โ”‚   โ”‚       โ”œโ”€โ”€ 0.01_test.txt
โ”‚   โ”‚       โ”œโ”€โ”€ 0.01_train.txt
โ”‚   โ”‚       โ””โ”€โ”€ 0.01_val.txt
โ”‚   โ”œโ”€โ”€ loss
โ”‚   โ”‚   โ”œโ”€โ”€ CosineDistill.py
โ”‚   โ”‚   โ””โ”€โ”€ SoftmaxLoss.py
โ”‚   โ”œโ”€โ”€ models
โ”‚   โ”‚   โ”œโ”€โ”€ CosineDotProductClassifier.py
โ”‚   โ”‚   โ”œโ”€โ”€ DotProductClassifier.py
โ”‚   โ”‚   โ”œโ”€โ”€ ecbd_converter.py
โ”‚   โ”‚   โ”œโ”€โ”€ ResNet32Feature.py
โ”‚   โ”‚   โ”œโ”€โ”€ ResNext50Feature.py
โ”‚   โ”‚   โ””โ”€โ”€ ResNextFeature.py
โ”‚   โ”œโ”€โ”€ samplers
โ”‚   โ”‚   โ””โ”€โ”€ ClassAwareSampler.py
โ”‚   โ””โ”€โ”€ utils
โ”‚       โ”œโ”€โ”€ Default_config.yaml
โ”‚       โ”œโ”€โ”€ experiments_maker.py
โ”‚       โ”œโ”€โ”€ globals.py
โ”‚       โ”œโ”€โ”€ logger.py
โ”‚       โ””โ”€โ”€ utils.py
โ”œโ”€โ”€ LICENSE
โ”œโ”€โ”€ main.py
โ”œโ”€โ”€ Notebooks
โ”‚   โ”œโ”€โ”€ Create_mini-ImageNet-LT.ipynb
โ”‚   โ””โ”€โ”€ toy_example.ipynb
โ”œโ”€โ”€ readme_assets
โ”‚   โ”œโ”€โ”€ method.svg
โ”‚   โ””โ”€โ”€ toy_example_output.svg
โ”œโ”€โ”€ README.md
โ”œโ”€โ”€ run_all_CIFAR100-LT.sh
โ”œโ”€โ”€ run_all_mini-ImageNet-LT.sh
โ”œโ”€โ”€ run_TailCalibX_CIFAR100-LT.sh
โ””โ”€โ”€ run_TailCalibX_mini-imagenet-LT.sh

Ignored tailcalib_pip as it is for the tailcalib pip package.

๐Ÿ“ƒ Citation

@inproceedings{rahul2021tailcalibX,
    title   = {{Feature Generation for Long-tail Classification}},
    author  = {Rahul Vigneswaran and Marc T. Law and Vineeth N. Balasubramanian and Makarand Tapaswi},
    booktitle = {ICVGIP},
    year = {2021}
}

๐Ÿ‘ Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

โค About me

Rahul Vigneswaran

โœจ Extras

๐Ÿ Long-tail buzz : If you are interested in deep learning research which involves long-tailed / imbalanced dataset, take a look at Long-tail buzz to learn about the recent trending papers in this field.

๐Ÿ“ License

MIT

Owner
Rahul Vigneswaran
Rahul Vigneswaran
Code for: Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space. Nicholas Monath, Manzil Zaheer, Daniel Silva, Andrew McCallum, Amr Ahmed. KDD 2019.

gHHC Code for: Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space. Nicholas Monath, Manzil Zaheer, D

Nicholas Monath 35 Nov 16, 2022
Basics of 2D and 3D Human Pose Estimation.

Human Pose Estimation 101 If you want a slightly more rigorous tutorial and understand the basics of Human Pose Estimation and how the field has evolv

Sudharshan Chandra Babu 293 Dec 14, 2022
Repository for GNSS-based position estimation using a Deep Neural Network

Code repository accompanying our work on 'Improving GNSS Positioning using Neural Network-based Corrections'. In this paper, we present a Deep Neural

32 Dec 13, 2022
KakaoBrain KoGPT (Korean Generative Pre-trained Transformer)

KoGPT KoGPT (Korean Generative Pre-trained Transformer) https://github.com/kakaobrain/kogpt https://huggingface.co/kakaobrain/kogpt Model Descriptions

Kakao Brain 799 Dec 28, 2022
A tiny, friendly, strong baseline code for Person-reID (based on pytorch).

Pytorch ReID Strong, Small, Friendly A tiny, friendly, strong baseline code for Person-reID (based on pytorch). Strong. It is consistent with the new

Zhedong Zheng 3.5k Jan 08, 2023
CV backbones including GhostNet, TinyNet and TNT, developed by Huawei Noah's Ark Lab.

CV Backbones including GhostNet, TinyNet, TNT (Transformer in Transformer) developed by Huawei Noah's Ark Lab. GhostNet Code TinyNet Code TNT Code Pyr

HUAWEI Noah's Ark Lab 3k Jan 08, 2023
PushForKiCad - AISLER Push for KiCad EDA

AISLER Push for KiCad Push your layout to AISLER with just one click for instant

AISLER 31 Dec 29, 2022
pytorch implementation of openpose including Hand and Body Pose Estimation.

pytorch-openpose pytorch implementation of openpose including Body and Hand Pose Estimation, and the pytorch model is directly converted from openpose

Hzzone 1.4k Jan 07, 2023
The authors' official PyTorch SigWGAN implementation

The authors' official PyTorch SigWGAN implementation This repository is the official implementation of [Sig-Wasserstein GANs for Time Series Generatio

9 Jun 16, 2022
Official pytorch code for SSC-GAN: Semi-Supervised Single-Stage Controllable GANs for Conditional Fine-Grained Image Generation(ICCV 2021)

SSC-GAN_repo Pytorch implementation for 'Semi-Supervised Single-Stage Controllable GANs for Conditional Fine-Grained Image Generation'.PDF SSC-GAN:Sem

tyty 4 Aug 28, 2022
Real-time face detection and emotion/gender classification using fer2013/imdb datasets with a keras CNN model and openCV.

Real-time face detection and emotion/gender classification using fer2013/imdb datasets with a keras CNN model and openCV.

Octavio Arriaga 5.3k Dec 30, 2022
Weakly- and Semi-Supervised Panoptic Segmentation (ECCV18)

Weakly- and Semi-Supervised Panoptic Segmentation by Qizhu Li*, Anurag Arnab*, Philip H.S. Torr This repository demonstrates the weakly supervised gro

Qizhu Li 159 Dec 20, 2022
A PyTorch implementation of EventProp [https://arxiv.org/abs/2009.08378], a method to train Spiking Neural Networks

Spiking Neural Network training with EventProp This is an unofficial PyTorch implemenation of EventProp, a method to compute exact gradients for Spiki

Pedro Savarese 35 Jul 29, 2022
Official implementation for "Low-light Image Enhancement via Breaking Down the Darkness"

Low-light Image Enhancement via Breaking Down the Darkness by Qiming Hu, Xiaojie Guo. 1. Dependencies Python3 PyTorch=1.0 OpenCV-Python, TensorboardX

Qiming Hu 30 Jan 01, 2023
PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech

PortaSpeech - PyTorch Implementation PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech. Model Size Module Nor

Keon Lee 279 Jan 04, 2023
A python code to convert Keras pre-trained weights to Pytorch version

Weights_Keras_2_Pytorch ๆœ€่ฟ‘ๆƒณๅœจPytorch้กน็›ฎ้‡Œไฝฟ็”จไธ€ไธ‹่ฐทๆญŒ็š„NIMA๏ผŒไฝ†ๆ˜ฏๅ‘็Žฐๆฒกๆœ‰้ข„่ฎญ็ปƒๅฅฝ็š„pytorchๆƒ้‡๏ผŒไบŽๆ˜ฏๆ•ด็†ไบ†ไธ€ไธ‹ๅฐ†Keras้ข„่ฎญ็ปƒๆƒ้‡่ฝฌไธบPytorch็š„ไปฃ็ ๏ผŒ็›ฎๅ‰ๆ˜ฏๆ”ฏๆŒKeras็š„Conv2D, Dense, DepthwiseConv2D, Batch

Liu Hengyu 2 Dec 16, 2021
Codebase of deep learning models for inferring stability of mRNA molecules

Kaggle OpenVaccine Models Codebase of deep learning models for inferring stability of mRNA molecules, corresponding to the Kaggle Open Vaccine Challen

Eternagame 40 Dec 29, 2022
Optimal Camera Position for a Practical Application of Gaze Estimation on Edge Devices,

Optimal Camera Position for a Practical Application of Gaze Estimation on Edge Devices, Linh Van Ma, Tin Trung Tran, Moongu Jeon, ICAIIC 2022 (The 4th

Linh 11 Oct 10, 2022
PyTorch implementation of probabilistic deep forecast applied to air quality.

Probabilistic Deep Forecast PyTorch implementation of a paper, titled: Probabilistic Deep Learning to Quantify Uncertainty in Air Quality Forecasting

Abdulmajid Murad 13 Nov 16, 2022
Range Image-based LiDAR Localization for Autonomous Vehicles Using Mesh Maps

Range Image-based 3D LiDAR Localization This repo contains the code for our ICRA2021 paper: Range Image-based LiDAR Localization for Autonomous Vehicl

Photogrammetry & Robotics Bonn 208 Dec 15, 2022