Pytorch implementation of TailCalibX : Feature Generation for Long-tail Classification

Overview

TailCalibX : Feature Generation for Long-tail Classification

by Rahul Vigneswaran, Marc T. Law, Vineeth N. Balasubramanian, Makarand Tapaswi

[arXiv] [Code] [pip Package] [Video] TailCalibX methodology

Table of contents

🐣 Easy Usage (Recommended way to use our method)

⚠ Caution: TailCalibX is just TailCalib employed multiple times. Specifically, we generate a set of features once every epoch and use them to train the classifier. In order to mimic that, three things must be done at every epoch in the following order:

  1. Collect all the features from your dataloader.
  2. Use the tailcalib package to make the features balanced by generating samples.
  3. Train the classifier.
  4. Repeat.

πŸ’» Installation

Use the package manager pip to install tailcalib.

pip install tailcalib

πŸ‘¨β€πŸ’» Example Code

Check the instruction here for a much more detailed python package information.

# Import
from tailcalib import tailcalib

# Initialize
a = tailcalib(base_engine="numpy")   # Options: "numpy", "pytorch"

# Imbalanced random fake data
import numpy as np
X = np.random.rand(200,100)
y = np.random.randint(0,10, (200,))

# Balancing the data using "tailcalib"
feat, lab, gen = a.generate(X=X, y=y)

# Output comparison
print(f"Before: {np.unique(y, return_counts=True)}")
print(f"After: {np.unique(lab, return_counts=True)}")

πŸ§ͺ Advanced Usage

βœ” Things to do before you run the code from this repo

  • Change the data_root for your dataset in main.py.
  • If you are using wandb logging (Weights & Biases), make sure to change the wandb.init in main.py accordingly.

πŸ“€ How to use?

  • For just the methods proposed in this paper :
    • For CIFAR100-LT: run_TailCalibX_CIFAR100-LT.sh
    • For mini-ImageNet-LT : run_TailCalibX_mini-ImageNet-LT.sh
  • For all the results show in the paper :
    • For CIFAR100-LT: run_all_CIFAR100-LT.sh
    • For mini-ImageNet-LT : run_all_mini-ImageNet-LT.sh

πŸ“š How to create the mini-ImageNet-LT dataset?

Check Notebooks/Create_mini-ImageNet-LT.ipynb for the script that generates the mini-ImageNet-LT dataset with varying imbalance ratios and train-test-val splits.

βš™ Arguments

  • --seed : Select seed for fixing it.

    • Default : 1
  • --gpu : Select the GPUs to be used.

    • Default : "0,1,2,3"
  • --experiment: Experiment number (Check 'libs/utils/experiment_maker.py').

    • Default : 0.1
  • --dataset : Dataset number.

    • Choices : 0 - CIFAR100, 1 - mini-imagenet
    • Default : 0
  • --imbalance : Select Imbalance factor.

    • Choices : 0: 1, 1: 100, 2: 50, 3: 10
    • Default : 1
  • --type_of_val : Choose which dataset split to use.

    • Choices: "vt": val_from_test, "vtr": val_from_train, "vit": val_is_test
    • Default : "vit"
  • --cv1 to --cv9 : Custom variable to use in experiments - purpose changes according to the experiment.

    • Default : "1"
  • --train : Run training sequence

    • Default : False
  • --generate : Run generation sequence

    • Default : False
  • --retraining : Run retraining sequence

    • Default : False
  • --resume : Will resume from the 'latest_model_checkpoint.pth' and wandb if applicable.

    • Default : False
  • --save_features : Collect feature representations.

    • Default : False
  • --save_features_phase : Dataset split of representations to collect.

    • Choices : "train", "val", "test"
    • Default : "train"
  • --config : If you have a yaml file with appropriate config, provide the path here. Will override the 'experiment_maker'.

    • Default : None

πŸ‹οΈβ€β™‚οΈ Trained weights

Experiment CIFAR100-LT (ResNet32, seed 1, Imb 100) mini-ImageNet-LT (ResNeXt50)
TailCalib Git-LFS Git-LFS
TailCalibX Git-LFS Git-LFS
CBD + TailCalibX Git-LFS Git-LFS

πŸͺ€ Results on a Toy Dataset

Open In Colab

The higher the Imb ratio, the more imbalanced the dataset is. Imb ratio = maximum_sample_count / minimum_sample_count.

Check this notebook to play with the toy example from which the plot below was generated.

🌴 Directory Tree

TailCalibX
β”œβ”€β”€ libs
β”‚   β”œβ”€β”€ core
β”‚   β”‚   β”œβ”€β”€ ce.py
β”‚   β”‚   β”œβ”€β”€ core_base.py
β”‚   β”‚   β”œβ”€β”€ ecbd.py
β”‚   β”‚   β”œβ”€β”€ modals.py
β”‚   β”‚   β”œβ”€β”€ TailCalib.py
β”‚   β”‚   └── TailCalibX.py
β”‚   β”œβ”€β”€ data
β”‚   β”‚   β”œβ”€β”€ dataloader.py
β”‚   β”‚   β”œβ”€β”€ ImbalanceCIFAR.py
β”‚   β”‚   └── mini-imagenet
β”‚   β”‚       β”œβ”€β”€ 0.01_test.txt
β”‚   β”‚       β”œβ”€β”€ 0.01_train.txt
β”‚   β”‚       └── 0.01_val.txt
β”‚   β”œβ”€β”€ loss
β”‚   β”‚   β”œβ”€β”€ CosineDistill.py
β”‚   β”‚   └── SoftmaxLoss.py
β”‚   β”œβ”€β”€ models
β”‚   β”‚   β”œβ”€β”€ CosineDotProductClassifier.py
β”‚   β”‚   β”œβ”€β”€ DotProductClassifier.py
β”‚   β”‚   β”œβ”€β”€ ecbd_converter.py
β”‚   β”‚   β”œβ”€β”€ ResNet32Feature.py
β”‚   β”‚   β”œβ”€β”€ ResNext50Feature.py
β”‚   β”‚   └── ResNextFeature.py
β”‚   β”œβ”€β”€ samplers
β”‚   β”‚   └── ClassAwareSampler.py
β”‚   └── utils
β”‚       β”œβ”€β”€ Default_config.yaml
β”‚       β”œβ”€β”€ experiments_maker.py
β”‚       β”œβ”€β”€ globals.py
β”‚       β”œβ”€β”€ logger.py
β”‚       └── utils.py
β”œβ”€β”€ LICENSE
β”œβ”€β”€ main.py
β”œβ”€β”€ Notebooks
β”‚   β”œβ”€β”€ Create_mini-ImageNet-LT.ipynb
β”‚   └── toy_example.ipynb
β”œβ”€β”€ readme_assets
β”‚   β”œβ”€β”€ method.svg
β”‚   └── toy_example_output.svg
β”œβ”€β”€ README.md
β”œβ”€β”€ run_all_CIFAR100-LT.sh
β”œβ”€β”€ run_all_mini-ImageNet-LT.sh
β”œβ”€β”€ run_TailCalibX_CIFAR100-LT.sh
└── run_TailCalibX_mini-imagenet-LT.sh

Ignored tailcalib_pip as it is for the tailcalib pip package.

πŸ“ƒ Citation

@inproceedings{rahul2021tailcalibX,
    title   = {{Feature Generation for Long-tail Classification}},
    author  = {Rahul Vigneswaran and Marc T. Law and Vineeth N. Balasubramanian and Makarand Tapaswi},
    booktitle = {ICVGIP},
    year = {2021}
}

πŸ‘ Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

❀ About me

Rahul Vigneswaran

✨ Extras

🐝 Long-tail buzz : If you are interested in deep learning research which involves long-tailed / imbalanced dataset, take a look at Long-tail buzz to learn about the recent trending papers in this field.

πŸ“ License

MIT

Owner
Rahul Vigneswaran
Rahul Vigneswaran
OREO: Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning (NeurIPS 2021)

OREO: Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning (NeurIPS 2021) Video demo We here provide a video demo from co

20 Nov 25, 2022
ONNX-PackNet-SfM: Python scripts for performing monocular depth estimation using the PackNet-SfM model in ONNX

Python scripts for performing monocular depth estimation using the PackNet-SfM model in ONNX

Ibai Gorordo 14 Dec 09, 2022
This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' published at ECIR'22.

Paragraph Aggregation Retrieval Model (PARM) for Dense Document-to-Document Retrieval This repository contains the code for the paper PARM: A Paragrap

Sophia Althammer 33 Aug 26, 2022
A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains (IJCV submission)

wsss-analysis The code of: A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains, arXiv pre-print 2019 paper.

Lyndon Chan 48 Dec 18, 2022
Code associated with the paper "Deep Optics for Single-shot High-dynamic-range Imaging"

Deep Optics for Single-shot High-dynamic-range Imaging Code associated with the paper "Deep Optics for Single-shot High-dynamic-range Imaging" CVPR, 2

Stanford Computational Imaging Lab 40 Dec 12, 2022
Music Source Separation; Train & Eval & Inference piplines and pretrained models we used for 2021 ISMIR MDX Challenge.

Music Source Separation with Channel-wise Subband Phase Aware ResUnet (CWS-PResUNet) Introduction This repo contains the pretrained Music Source Separ

Lau 100 Dec 25, 2022
The repository for freeCodeCamp's YouTube course, Algorithmic Trading in Python

Algorithmic Trading in Python This repository Course Outline Section 1: Algorithmic Trading Fundamentals What is Algorithmic Trading? The Differences

Nick McCullum 1.8k Jan 02, 2023
Scrutinizing XAI with linear ground-truth data

This repository contains all the experiments presented in the corresponding paper: "Scrutinizing XAI using linear ground-truth data with suppressor va

braindata lab 2 Oct 04, 2022
Random Erasing Data Augmentation. Experiments on CIFAR10, CIFAR100 and Fashion-MNIST

Random Erasing Data Augmentation =============================================================== black white random This code has the source code for

Zhun Zhong 654 Dec 26, 2022
Educational 2D SLAM implementation based on ICP and Pose Graph

slam-playground Educational 2D SLAM implementation based on ICP and Pose Graph How to use: Use keyboard arrow keys to navigate robot. Press 'r' to vie

Kirill 19 Dec 17, 2022
Train the HRNet model on ImageNet

High-resolution networks (HRNets) for Image classification News [2021/01/20] Add some stronger ImageNet pretrained models, e.g., the HRNet_W48_C_ssld_

HRNet 866 Jan 04, 2023
Official implementation of Self-supervised Image-to-text and Text-to-image Synthesis

Self-supervised Image-to-text and Text-to-image Synthesis This is the official implementation of Self-supervised Image-to-text and Text-to-image Synth

6 Jul 31, 2022
Sequence-tagging using deep learning

Classification using Deep Learning Requirements PyTorch version = 1.9.1+cu111 Python version = 3.8.10 PyTorch-Lightning version = 1.4.9 Huggingface

Vineet Kumar 2 Dec 20, 2022
Code for the ICCV'21 paper "Context-aware Scene Graph Generation with Seq2Seq Transformers"

ICCV'21 Context-aware Scene Graph Generation with Seq2Seq Transformers Authors: Yichao Lu*, Himanshu Rai*, Cheng Chang*, Boris Knyazev†, Guangwei Yu,

Layer6 Labs 37 Dec 18, 2022
Implementation of CSRL from the AAAI2022 paper: Constraint Sampling Reinforcement Learning: Incorporating Expertise For Faster Learning

CSRL Implementation of CSRL from the AAAI2022 paper: Constraint Sampling Reinforcement Learning: Incorporating Expertise For Faster Learning Python: 3

4 Apr 14, 2022
Multivariate Time Series Forecasting with efficient Transformers. Code for the paper "Long-Range Transformers for Dynamic Spatiotemporal Forecasting."

Spacetimeformer Multivariate Forecasting This repository contains the code for the paper, "Long-Range Transformers for Dynamic Spatiotemporal Forecast

QData 440 Jan 02, 2023
Tutorial materials for Part of NSU Intro to Deep Learning with PyTorch.

Intro to Deep Learning Materials are part of North South University (NSU) Intro to Deep Learning with PyTorch workshop series. (Slides) Related materi

Hasib Zunair 9 Jun 08, 2022
A Simple and Versatile Framework for Object Detection and Instance Recognition

SimpleDet - A Simple and Versatile Framework for Object Detection and Instance Recognition Major Features FP16 training for memory saving and up to 2.

TuSimple 3k Dec 12, 2022
FB-tCNN for SSVEP Recognition

FB-tCNN for SSVEP Recognition Here are the codes of the tCNN and FB-tCNN in the paper "Filter Bank Convolutional Neural Network for Short Time-Window

Wenlong Ding 12 Dec 14, 2022
A sample pytorch Implementation of ACL 2021 research paper "Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction".

Span-ASTE-Pytorch This repository is a pytorch version that implements Ali's ACL 2021 research paper Learning Span-Level Interactions for Aspect Senti

ζ₯θ‡ͺδΈΉιΊ¦ηš„ε€©η± 10 Dec 06, 2022