This is the code of paper ``Contrastive Coding for Active Learning under Class Distribution Mismatch'' with python.

Related tags

Deep LearningCCAL
Overview

Contrastive Coding for Active Learning under Class Distribution Mismatch

Official PyTorch implementation of ["Contrastive Coding for Active Learning under Class Distribution Mismatch"]( ICCV2021)

1. Requirements

Environments

Currently, requires following packages.

  • CUDA 10.1+
  • python == 3.7.9
  • pytorch == 1.7.1
  • torchvision == 0.8.2
  • scikit-learn == 0.24.0
  • tensorboardx == 2.1
  • matplotlib == 3.3.3
  • numpy == 1.19.2
  • scipy == 1.5.3
  • apex == 0.1
  • diffdist == 0.1
  • pytorch-gradual-warmup-lr packages

Datasets

For CIFAR10 and CIFAR100, we provide a function to automatically download and preprocess the data, you can also download the datasets from the link, and please download it to ~/data.

2. Training

Currently, all code examples are assuming distributed launch with 4 multi GPUs. To run the code with single GPU, remove -m torch.distributed.launch --nproc_per_node=4.

Semantic feature extraction

To train semantic feature extraction in the paper, run this command:

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 contrast_main.py --mismatch 0.8 --dataset <DATASET> --model <NETWORK> --mode senmatic --shift_trans_type none --batch_size 32 --epoch <EPOCH> --logdir './model/semantic'
  • Option
  • For CIFAR10, set --datatset cifar10, else set --datatset cifar100.
  • In our experiment, we set --epoch 700 in cfar10 and --epoch 2000 in cifar100 .
  • And we set mismatch = 0.2, 0.4, 0.6, 0.8.

Distinctive feature extraction

To train distinctive feature extraction in the paper, run this command:

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 contrast_main.py --mismatch 0.8 --dataset <DATASET> --model <NETWORK> --mode feature --shift_trans_type rotation --batch_size 32 --epoch 700 --logdir './model/distinctive'
  • Option
  • For CIFAR10, set --datatset cifar10, else set --datatset cifar100.
  • In our experiment, we set --epoch 700 in cifar10 and cifar100 .
  • And we set mismatch = 0.2, 0.4, 0.6, 0.8.

Joint query strategy

To select samples from unlabeled dataset in the paper, run this command:

CUDA_VISIBLE_DEVICES=0 python active_main.py --mode eval --k 100.0 --t 0.9 --dataset <DATASET> --model <NETWORK> --mismatch <MISMATCH> --target <INT> --shift_trans_type rotation --print_score --ood_samples 10 --resize_factor 0.54 --resize_fix --load_feature_path './model/distinctive/last.model' --load_senmatic_path './model/semantic/last.model'  --load_path './model'
  • Option
  • For CIFAR10, set --datatset cifar10, else set --datatset cifar100.
  • The value of mismatch is between 0 and 1. In our experiment, we set mismatch = 0.2, 0.4, 0.6, 0.8.
  • --target represents the number of queried samples in each category in each AL cycle.

Then, we can get the index of the samples be queried in each active learning cycle. Take mismatch=0.8 for example,the index of the samples should be added in to CCAL_master/train_classifier/get_index_80.

3. Evaluation

To evaluate the proformance of CCAL, we provide a script to train a classifier, as shown in CCAL_master/train_classifier. , run this command to train the classifier:

CUDA_VISIBLE_DEVICES=0 python main.py --cuda --split <CYCLES> --dataset <DATASET> --mismatch <MISMATCH> --number <NUMBER> --epoch 100
  • Option
  • For CIFAR10, set --datatset cifar10, else set --datatset cifar100.
  • The value of mismatch is between 0 and 1. In our experiment, we set mismatch = 0.2, 0.4, 0.6, 0.8. The value of mismatch should be the same as before.
  • --number indicates the cycle of active learning.
  • --epoch indicates the epochs that training continues in each active learning cycle. In our experiment, we set --epoch 100.
  • --split represents the cycles of active learning.

Then, we can get the average of the accuracies over 5 runs(random seed = 0,1,2,3,4,5).

4. Citation

@InProceedings{Du_2021_ICCV,
    author    = {Du, Pan and Zhao, Suyun and Chen, Hui and Chai, Shuwen and Chen, Hong and Li, Cuiping},
    title     = {Contrastive Coding for Active Learning Under Class Distribution Mismatch},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {8927-8936}
}

5. Reference

@inproceedings{tack2020csi,
  title={CSI: Novelty Detection via Contrastive Learning on Distributionally Shifted Instances},
  author={Jihoon Tack and Sangwoo Mo and Jongheon Jeong and Jinwoo Shin},
  booktitle={Advances in Neural Information Processing Systems},
  year={2020}
}
Blender add-on: Add to Cameras menu: View → Camera, View → Add Camera, Camera → View, Previous Camera, Next Camera

Blender add-on: Camera additions In 3D view, it adds these actions to the View|Cameras menu: View → Camera : set the current camera to the 3D view Vie

German Bauer 11 Feb 08, 2022
Face Synthetics dataset is a collection of diverse synthetic face images with ground truth labels.

The Face Synthetics dataset Face Synthetics dataset is a collection of diverse synthetic face images with ground truth labels. It was introduced in ou

Microsoft 608 Jan 02, 2023
This repository includes different versions of the prescribed-time controller as Simulink blocks and MATLAB script codes for engineering applications.

Prescribed-time Control Prescribed-time control (PTC) blocks in Simulink environment, MATLAB R2020b. For more theoretical details, refer to the papers

Amir Shakouri 1 Mar 11, 2022
Tilted Empirical Risk Minimization (ICLR '21)

Tilted Empirical Risk Minimization This repository contains the implementation for the paper Tilted Empirical Risk Minimization ICLR 2021 Empirical ri

Tian Li 40 Nov 28, 2022
CLUES: Few-Shot Learning Evaluation in Natural Language Understanding

CLUES: Few-Shot Learning Evaluation in Natural Language Understanding This repo contains the data and source code for baseline models in the NeurIPS 2

Microsoft 29 Dec 29, 2022
Official repository for CVPR21 paper "Deep Stable Learning for Out-Of-Distribution Generalization".

StableNet StableNet is a deep stable learning method for out-of-distribution generalization. This is the official repo for CVPR21 paper "Deep Stable L

120 Dec 28, 2022
Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper

Ponder(ing) Transformer Implementation of a Transformer that learns to adapt the number of computational steps it takes depending on the difficulty of

Phil Wang 65 Oct 04, 2022
A testcase generation tool for Persistent Memory Programs.

PMFuzz PMFuzz is a testcase generation tool to generate high-value tests cases for PM testing tools (XFDetector, PMDebugger, PMTest and Pmemcheck) If

Systems Research at ShiftLab 14 Jul 24, 2022
Towards Implicit Text-Guided 3D Shape Generation (CVPR2022)

Towards Implicit Text-Guided 3D Shape Generation Towards Implicit Text-Guided 3D Shape Generation (CVPR2022) Code for the paper [Towards Implicit Text

55 Dec 16, 2022
Implementation of the paper "Fine-Tuning Transformers: Vocabulary Transfer"

Transformer-vocabulary-transfer Implementation of the paper "Fine-Tuning Transfo

LEYA 13 Nov 30, 2022
Single Red Blood Cell Hydrodynamic Traps Via the Generative Design

Rbc-traps-generative-design - The generative design for single red clood cell hydrodynamic traps using GEFEST framework

Natural Systems Simulation Lab 4 Jun 16, 2022
TANL: Structured Prediction as Translation between Augmented Natural Languages

TANL: Structured Prediction as Translation between Augmented Natural Languages Code for the paper "Structured Prediction as Translation between Augmen

98 Dec 15, 2022
Universal Adversarial Triggers for Attacking and Analyzing NLP (EMNLP 2019)

Universal Adversarial Triggers for Attacking and Analyzing NLP This is the official code for the EMNLP 2019 paper, Universal Adversarial Triggers for

Eric Wallace 248 Dec 17, 2022
NeRD: Neural Reflectance Decomposition from Image Collections

NeRD: Neural Reflectance Decomposition from Image Collections Project Page | Video | Paper | Dataset Implementation for NeRD. A novel method which dec

Computergraphics (University of Tübingen) 195 Dec 29, 2022
To SMOTE, or not to SMOTE?

To SMOTE, or not to SMOTE? This package includes the code required to repeat the experiments in the paper and to analyze the results. To SMOTE, or not

Amazon Web Services 1 Jan 03, 2022
Implementation of 'lightweight' GAN, proposed in ICLR 2021, in Pytorch. High resolution image generations that can be trained within a day or two

512x512 flowers after 12 hours of training, 1 gpu 256x256 flowers after 12 hours of training, 1 gpu Pizza 'Lightweight' GAN Implementation of 'lightwe

Phil Wang 1.5k Jan 02, 2023
Official implementation of the paper "Steganographer Detection via a Similarity Accumulation Graph Convolutional Network"

SAGCN - Official PyTorch Implementation | Paper | Project Page This is the official implementation of the paper "Steganographer detection via a simila

ZHANG Zhi 1 Nov 26, 2021
Select, weight and analyze complex sample data

Sample Analytics In large-scale surveys, often complex random mechanisms are used to select samples. Estimates derived from such samples must reflect

samplics 37 Dec 15, 2022
TensorFlow implementation of the algorithm in the paper "Decoupled Low-light Image Enhancement"

Decoupled Low-light Image Enhancement Shijie Hao1,2*, Xu Han1,2, Yanrong Guo1,2 & Meng Wang1,2 1Key Laboratory of Knowledge Engineering with Big Data

17 Apr 25, 2022