PyTorch implementation of TailCalibX: Feature Generation for Long-tail Classification

Overview

TailCalibX : Feature Generation for Long-tail Classification

by Rahul Vigneswaran, Marc T. Law, Vineeth N. Balasubramanian, Makarand Tapaswi

[arXiv] [Code] [pip Package] [Video]

Figure: TailCalibX methodology

🐣 Easy Usage (Recommended way to use our method)

⚠ Caution: TailCalibX is just TailCalib applied multiple times. Specifically, we generate a new set of features once every epoch and use them to train the classifier. To mimic that, do the following at every epoch, in this order:

  1. Collect all the features from your dataloader.
  2. Use the tailcalib package to make the features balanced by generating samples.
  3. Train the classifier.
  4. Repeat.
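The per-epoch order above can be sketched as a small loop. This is only an illustration under loud assumptions: `extract_features`, `balance_features`, and `train_classifier_one_epoch` are hypothetical helpers, the "backbone" is a random linear map, and `balance_features` is a simple jitter-based placeholder standing in for the tailcalib generation step, not the method from the paper.

```python
import numpy as np

def extract_features(images, backbone_weights):
    """Hypothetical stand-in for a forward pass through a frozen backbone."""
    return np.maximum(images @ backbone_weights, 0.0)

def balance_features(feats, labels, rng):
    """Placeholder for the tailcalib generation step (NOT the paper's method):
    pads every class up to the largest class with jittered copies of real features."""
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    out_feats, out_labels = [feats], [labels]
    for c, n in zip(classes, counts):
        missing = int(target - n)
        if missing > 0:
            idx = rng.choice(np.flatnonzero(labels == c), size=missing)
            noise = 0.01 * rng.standard_normal((missing, feats.shape[1]))
            out_feats.append(feats[idx] + noise)
            out_labels.append(np.full(missing, c))
    return np.concatenate(out_feats), np.concatenate(out_labels)

def train_classifier_one_epoch(W, feats, labels, lr=0.1):
    """One full-batch gradient step of softmax regression on the given features."""
    logits = feats @ W
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    probs[np.arange(len(labels)), labels] -= 1.0  # gradient of cross-entropy w.r.t. logits
    return W - lr * feats.T @ probs / len(labels)

rng = np.random.default_rng(0)
num_classes, dim = 3, 8
images = rng.standard_normal((130, 16))
labels = np.repeat(np.arange(num_classes), [100, 25, 5])  # long-tailed labels
backbone_weights = rng.standard_normal((16, dim))
W = np.zeros((dim, num_classes))

for epoch in range(5):
    feats = extract_features(images, backbone_weights)            # 1. collect features
    bal_feats, bal_labels = balance_features(feats, labels, rng)  # 2. balance them
    W = train_classifier_one_epoch(W, bal_feats, bal_labels)      # 3. train the classifier

print(np.unique(bal_labels, return_counts=True)[1])
```

In real use, step 2 would presumably be the call to the tailcalib package, and step 1 would iterate over your dataloader; the point here is only that generation is redone at every epoch rather than once.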

💻 Installation

Use the package manager pip to install tailcalib.

pip install tailcalib

👨‍💻 Example Code

See the instructions here for more detailed information on the Python package.

# Import
from tailcalib import tailcalib

# Initialize
a = tailcalib(base_engine="numpy")   # Options: "numpy", "pytorch"

# Imbalanced random fake data
import numpy as np
X = np.random.rand(200,100)
y = np.random.randint(0,10, (200,))

# Balancing the data using "tailcalib"
feat, lab, gen = a.generate(X=X, y=y)

# Output comparison
print(f"Before: {np.unique(y, return_counts=True)}")
print(f"After: {np.unique(lab, return_counts=True)}")

🧪 Advanced Usage

✔ Things to do before you run the code from this repo

  • Change the data_root for your dataset in main.py.
  • If you are using wandb logging (Weights & Biases), make sure to change the wandb.init in main.py accordingly.

📀 How to use?

  • For just the methods proposed in this paper:
    • For CIFAR100-LT: run_TailCalibX_CIFAR100-LT.sh
    • For mini-ImageNet-LT: run_TailCalibX_mini-ImageNet-LT.sh
  • For all the results shown in the paper:
    • For CIFAR100-LT: run_all_CIFAR100-LT.sh
    • For mini-ImageNet-LT: run_all_mini-ImageNet-LT.sh

📚 How to create the mini-ImageNet-LT dataset?

Check Notebooks/Create_mini-ImageNet-LT.ipynb for the script that generates the mini-ImageNet-LT dataset with varying imbalance ratios and train-test-val splits.
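To give a feel for what "varying imbalance ratios" means in terms of class sizes, here is a minimal sketch of one common convention: class sizes that decay exponentially from the largest to the smallest class. This is an assumption for illustration only and not necessarily what the notebook does; `long_tail_counts` and its parameters are hypothetical names.

```python
def long_tail_counts(max_count, num_classes, imb_ratio):
    """Per-class sample counts decaying exponentially from max_count
    down to roughly max_count / imb_ratio (requires num_classes >= 2)."""
    counts = []
    for i in range(num_classes):
        frac = i / (num_classes - 1)  # 0.0 for the head class, 1.0 for the last tail class
        counts.append(max(1, round(max_count * (1.0 / imb_ratio) ** frac)))
    return counts

counts = long_tail_counts(max_count=500, num_classes=100, imb_ratio=100)
print(counts[0], counts[-1], counts[0] / counts[-1])
```

With these inputs the head class keeps 500 samples and the last tail class keeps 5, so the ratio between the largest and smallest class equals the requested imbalance ratio.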

⚙ Arguments

  • --seed : Random seed to fix for reproducibility.

    • Default : 1
  • --gpu : Select the GPUs to be used.

    • Default : "0,1,2,3"
  • --experiment: Experiment number (Check 'libs/utils/experiment_maker.py').

    • Default : 0.1
  • --dataset : Dataset number.

    • Choices : 0 - CIFAR100, 1 - mini-imagenet
    • Default : 0
  • --imbalance : Select Imbalance factor.

    • Choices : 0: 1, 1: 100, 2: 50, 3: 10
    • Default : 1
  • --type_of_val : Choose which dataset split to use.

    • Choices: "vt": val_from_test, "vtr": val_from_train, "vit": val_is_test
    • Default : "vit"
  • --cv1 to --cv9 : Custom variables for use in experiments; their purpose depends on the experiment.

    • Default : "1"
  • --train : Run training sequence

    • Default : False
  • --generate : Run generation sequence

    • Default : False
  • --retraining : Run retraining sequence

    • Default : False
  • --resume : Resume from 'latest_model_checkpoint.pth' (and from the wandb run, if applicable).

    • Default : False
  • --save_features : Collect feature representations.

    • Default : False
  • --save_features_phase : Dataset split of representations to collect.

    • Choices : "train", "val", "test"
    • Default : "train"
  • --config : If you have a yaml file with appropriate config, provide the path here. Will override the 'experiment_maker'.

    • Default : None
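The argument list above can be mirrored with a small `argparse` sketch. This is not the repository's actual `main.py`: the flag names, choices, and defaults come from the list above, while the types and the use of `store_true` for the boolean flags are assumptions.

```python
import argparse

# Integer codes documented in the argument list above.
DATASETS = {0: "CIFAR100", 1: "mini-imagenet"}
IMBALANCE_FACTORS = {0: 1, 1: 100, 2: 50, 3: 10}

def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of the documented options")
    parser.add_argument("--seed", type=int, default=1)
    parser.add_argument("--gpu", type=str, default="0,1,2,3")
    parser.add_argument("--experiment", type=float, default=0.1)
    parser.add_argument("--dataset", type=int, default=0, choices=sorted(DATASETS))
    parser.add_argument("--imbalance", type=int, default=1, choices=sorted(IMBALANCE_FACTORS))
    parser.add_argument("--type_of_val", type=str, default="vit", choices=["vt", "vtr", "vit"])
    for i in range(1, 10):  # --cv1 ... --cv9
        parser.add_argument(f"--cv{i}", type=str, default="1")
    # Assumed to be simple on/off switches that default to False.
    for flag in ("train", "generate", "retraining", "resume", "save_features"):
        parser.add_argument(f"--{flag}", action="store_true")
    parser.add_argument("--save_features_phase", type=str, default="train",
                        choices=["train", "val", "test"])
    parser.add_argument("--config", type=str, default=None)
    return parser

defaults = build_parser().parse_args([])
args = build_parser().parse_args(["--train", "--dataset", "1", "--imbalance", "2"])
print(DATASETS[args.dataset], IMBALANCE_FACTORS[args.imbalance], args.train)
```

Note that `--imbalance` takes the code, not the factor itself: the default code `1` corresponds to an imbalance factor of 100.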

🏋️‍♂️ Trained weights

| Experiment | CIFAR100-LT (ResNet32, seed 1, Imb 100) | mini-ImageNet-LT (ResNeXt50) |
| --- | --- | --- |
| TailCalib | Git-LFS | Git-LFS |
| TailCalibX | Git-LFS | Git-LFS |
| CBD + TailCalibX | Git-LFS | Git-LFS |

🪀 Results on a Toy Dataset

The higher the Imb ratio, the more imbalanced the dataset is. Imb ratio = maximum_sample_count / minimum_sample_count.
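As a quick illustration of this definition, the ratio can be computed directly from a label vector. `imbalance_ratio` is a hypothetical helper written for this example, not part of the repository or the tailcalib package.

```python
import numpy as np

def imbalance_ratio(labels):
    """Imb ratio = maximum_sample_count / minimum_sample_count over the classes present."""
    _, counts = np.unique(labels, return_counts=True)
    return counts.max() / counts.min()

labels = np.repeat([0, 1, 2], [500, 50, 5])  # class sizes 500, 50, 5
print(imbalance_ratio(labels))  # 100.0
```

A perfectly balanced dataset gives a ratio of 1.0.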

Check this notebook to play with the toy example from which the plot below was generated.

🌴 Directory Tree

TailCalibX
├── libs
│   ├── core
│   │   ├── ce.py
│   │   ├── core_base.py
│   │   ├── ecbd.py
│   │   ├── modals.py
│   │   ├── TailCalib.py
│   │   └── TailCalibX.py
│   ├── data
│   │   ├── dataloader.py
│   │   ├── ImbalanceCIFAR.py
│   │   └── mini-imagenet
│   │       ├── 0.01_test.txt
│   │       ├── 0.01_train.txt
│   │       └── 0.01_val.txt
│   ├── loss
│   │   ├── CosineDistill.py
│   │   └── SoftmaxLoss.py
│   ├── models
│   │   ├── CosineDotProductClassifier.py
│   │   ├── DotProductClassifier.py
│   │   ├── ecbd_converter.py
│   │   ├── ResNet32Feature.py
│   │   ├── ResNext50Feature.py
│   │   └── ResNextFeature.py
│   ├── samplers
│   │   └── ClassAwareSampler.py
│   └── utils
│       ├── Default_config.yaml
│       ├── experiments_maker.py
│       ├── globals.py
│       ├── logger.py
│       └── utils.py
├── LICENSE
├── main.py
├── Notebooks
│   ├── Create_mini-ImageNet-LT.ipynb
│   └── toy_example.ipynb
├── readme_assets
│   ├── method.svg
│   └── toy_example_output.svg
├── README.md
├── run_all_CIFAR100-LT.sh
├── run_all_mini-ImageNet-LT.sh
├── run_TailCalibX_CIFAR100-LT.sh
└── run_TailCalibX_mini-imagenet-LT.sh

The tailcalib_pip directory is omitted from the tree above, as it only holds the tailcalib pip package.

📃 Citation

@inproceedings{rahul2021tailcalibX,
    title   = {{Feature Generation for Long-tail Classification}},
    author  = {Rahul Vigneswaran and Marc T. Law and Vineeth N. Balasubramanian and Makarand Tapaswi},
    booktitle = {ICVGIP},
    year = {2021}
}

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

❤ About me

Rahul Vigneswaran

✨ Extras

Long-tail buzz: If you are interested in deep learning research that involves long-tailed / imbalanced datasets, take a look at Long-tail buzz to learn about recent trending papers in this field.

License

MIT
