This repository provides a PyTorch implementation and model weights for HCSC (Hierarchical Contrastive Selective Coding)

Related tags

Deep LearningHCSC
Overview

HCSC: Hierarchical Contrastive Selective Coding

This repository provides a PyTorch implementation and model weights for HCSC (Hierarchical Contrastive Selective Coding), whose details are in this paper.

HCSC is an effective and efficient method to pre-train image encoders in a self-supervised fashion. In general, this method seeks to learn image representations with hierarchical semantic structures. It utilizes hierarchical K-means to derive hierarchical prototypes, and these prototypes represent the hierarchical semantics underlying the data. On such basis, we perform Instance-wise and Prototypical Contrastive Selective Coding to inject the information within hierarchical prototypes into image representations. HCSC has achieved SOTA performance on the self-supervised pre-training of CNNs (e.g., ResNet-50), and we will further study its potential on pre-training Vision Transformers.

Roadmap

  • [2022/02/01] The initial release! We release all source code for pre-training and downstream evaluation. We release three pre-trained ResNet-50 models: 200 epochs (single-crop), 200 epochs (multi-crop) and 400 epochs (single-crop, batch size: 256).

TODO

  • Finish the pre-training of 400 epochs ResNet-50 models (multi-crop) and release.
  • Finish the pre-training of 800 epochs ResNet-50 models (single- & multi-crop) and release.
  • Support Vision Transformer backbones.
  • Pre-train Vision Transformers with HCSC and release model weights under various configurations.

Model Zoo

We will continually release our pre-trained HCSC model weights and corresponding training configs. The current finished ones are as follows:

Backbone Method Crop Epoch Batch size Lincls top-1 Acc. KNN top-1 Acc. url config
ResNet-50 HCSC Single 200 256 69.2 60.7 model config
ResNet-50 HCSC Multi 200 256 73.3 66.6 model config
ResNet-50 HCSC Single 400 256 70.6 63.4 model config

Installation

Use following command to install dependencies (python3.7 with pip installed):

pip3 install -r requirement.txt

If having trouble installing PyTorch, follow the original guidance (https://pytorch.org/). Notably, the code is tested with cudatoolkit version 10.2.

Pre-training on ImageNet

Download ImageNet dataset under [ImageNet Folder]. Go to the path "[ImageNet Folder]/val" and use this script to build sub-folders.

To train single-crop HCSC on 8 Tesla-V100-32GB GPUs for 200 epochs, run:

python3 -m torch.distributed.launch --master_port [your port] --nproc_per_node=8 \
pretrain.py [your ImageNet Folder]

To train multi-crop HCSC on 8 Tesla-V100-32GB GPUs for 200 epochs, run:

python3 -m torch.distributed.launch --master_port [your port] --nproc_per_node=8 \
pretrain.py --multicrop [your ImageNet Folder]

Downstream Evaluation

Evaluation: Linear Classification on ImageNet

With a pre-trained model, to train a supervised linear classifier with all available GPUs, run:

python3 eval_lincls_imagenet.py --data [your ImageNet Folder] \
--dist-url tcp://localhost:10001 --world-size 1 --rank 0 \
--pretrained [your pre-trained model (example:out.pth)]

Evaluation: KNN Evaluation on ImageNet

To reproduce the KNN evaluation results with a pre-trained model using a single GPU, run:

python3 -m torch.distributed.launch --master_port [your port] --nproc_per_node=1 eval_knn.py \
--checkpoint_key state_dict \
--pretrained [your pre-trained model] \
--data [your ImageNet Folder]

Evaluation: Semi-supervised Learning on ImageNet

To fine-tune a pre-trained model with 1% or 10% ImageNet labels with 8 Tesla-V100-32GB GPUs, run:

1% of labels:

python3 -m torch.distributed.launch --nproc_per_node 8 --master_port [your port] eval_semisup.py \
--labels_perc 1 \
--pretrained [your pretrained weights] \
[your ImageNet Folder]

10% of labels:

python3 -m torch.distributed.launch --nproc_per_node 8 --master_port [your port] eval_semisup.py \
--labels_perc 10 \
--pretrained [your pretrained weights] \
[your ImageNet Folder]

Evaluation: Transfer Learning - Classification on VOC / Places205

VOC

1. Download the VOC dataset.
2. Finetune and evaluate on PASCAL VOC (with a single GPU):
cd voc_cls/ 
python3 main.py --data [your voc data folder] \
--pretrained [your pretrained weights]

Places205

1. Download the Places205 dataset (resized 256x256 version)
2. Linear Classification on Places205 (with all available GPUs):
python3 eval_lincls_places.py --data [your places205 data folder] \
--data-url tcp://localhost:10001 \
--pretrained [your pretrained weights]

Evaluation: Transfer Learning - Object Detection on VOC / COCO

1. Download VOC and COCO Dataset (under ./detection/datasets).

2. Install detectron2.

3. Convert a pre-trained model to the format of detectron2:

cd detection
python3 convert-pretrain-to-detectron2.py [your pretrained weight] out.pkl

4. Train on PASCAL VOC/COCO:

Finetune and evaluate on VOC (with 8 Tesla-V100-32GB GPUs):
cd detection
python3 train_net.py --config-file ./configs/pascal_voc_R_50_C4_24k_hcsc.yaml \
--num-gpus 8 MODEL.WEIGHTS out.pkl
Finetune and evaluate on COCO (with 8 Tesla-V100-32GB GPUs):
cd detection
python3 train_net.py --config-file ./configs/coco_R_50_C4_2x_hcsc.yaml \
--num-gpus 8 MODEL.WEIGHTS out.pkl

Evaluation: Clustering Evaluation on ImageNet

To reproduce the clustering evaluation results with a pre-trained model using all available GPUs, run:

python3 eval_clustering.py --dist-url tcp://localhost:10001 \
--multiprocessing-distributed --world-size 1 --rank 0 \
--num-cluster [target num cluster] \
--pretrained [your pretrained model weights] \
[your ImageNet Folder]

In the experiments of our paper, we set --num-cluster as 25000 and 1000.

License

This repository is released under the MIT license as in the LICENSE file.

Citation

If you find this repository useful, please kindly consider citing the following paper:

@article{guo2022hcsc,
  title={HCSC: Hierarchical Contrastive Selective Coding},
  author={Guo, Yuanfan and Xu, Minghao and Li, Jiawen and Ni, Bingbing and Zhu, Xuanyu and Sun, Zhenbang and Xu, Yi},
  journal={arXiv preprint arXiv:2202.00455},
  year={2022}
}
Owner
YUANFAN GUO
From SJTU. Working on self-supervised pre-training.
YUANFAN GUO
OCR Streamlit App is used to extract text from images using python's easyocr, pytorch and streamlit packages

OCR-Streamlit-App OCR Streamlit App is used to extract text from images using python's easyocr, pytorch and streamlit packages OCR app gets an image a

Siva Prakash 5 Apr 05, 2022
Pytorch implementation of various High Dynamic Range (HDR) Imaging algorithms

Deep High Dynamic Range Imaging Benchmark This repository is the pytorch impleme

Tianhong Dai 5 Nov 16, 2022
PyTorch code accompanying our paper on Maximum Entropy Generators for Energy-Based Models

Maximum Entropy Generators for Energy-Based Models All experiments have tensorboard visualizations for samples / density / train curves etc. To run th

Rithesh Kumar 135 Oct 27, 2022
This is a collection of simple PyTorch implementations of neural networks and related algorithms. These implementations are documented with explanations,

labml.ai Deep Learning Paper Implementations This is a collection of simple PyTorch implementations of neural networks and related algorithms. These i

labml.ai 16.4k Jan 09, 2023
ViDT: An Efficient and Effective Fully Transformer-based Object Detector

ViDT: An Efficient and Effective Fully Transformer-based Object Detector by Hwanjun Song1, Deqing Sun2, Sanghyuk Chun1, Varun Jampani2, Dongyoon Han1,

NAVER AI 262 Dec 27, 2022
Code for paper Novel View Synthesis via Depth-guided Skip Connections

Novel View Synthesis via Depth-guided Skip Connections Code for paper Novel View Synthesis via Depth-guided Skip Connections @InProceedings{Hou_2021_W

8 Mar 14, 2022
[ICML 2021] Break-It-Fix-It: Learning to Repair Programs from Unlabeled Data

Break-It-Fix-It: Learning to Repair Programs from Unlabeled Data This repo provides the source code & data of our paper: Break-It-Fix-It: Unsupervised

Michihiro Yasunaga 86 Nov 30, 2022
[ICCV'21] Official implementation for the paper Social NCE: Contrastive Learning of Socially-aware Motion Representations

CrowdNav with Social-NCE This is an official implementation for the paper Social NCE: Contrastive Learning of Socially-aware Motion Representations by

VITA lab at EPFL 125 Dec 23, 2022
An Unpaired Sketch-to-Photo Translation Model

Unpaired-Sketch-to-Photo-Translation We have released our code at https://github.com/rt219/Unsupervised-Sketch-to-Photo-Synthesis This project is the

38 Oct 28, 2022
Official implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

DiscoGAN Official PyTorch implementation of Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. Prerequisites Python 2.7

SK T-Brain 754 Dec 29, 2022
InterfaceGAN++: Exploring the limits of InterfaceGAN

InterfaceGAN++: Exploring the limits of InterfaceGAN Authors: Apavou Clément & Belkada Younes From left to right - Images generated using styleGAN and

Younes Belkada 42 Dec 23, 2022
Code for reproducing experiments in "Improved Training of Wasserstein GANs"

Improved Training of Wasserstein GANs Code for reproducing experiments in "Improved Training of Wasserstein GANs". Prerequisites Python, NumPy, Tensor

Ishaan Gulrajani 2.2k Jan 01, 2023
Ensemble Learning Priors Driven Deep Unfolding for Scalable Snapshot Compressive Imaging [PyTorch]

Ensemble Learning Priors Driven Deep Unfolding for Scalable Snapshot Compressive Imaging [PyTorch] Abstract Snapshot compressive imaging (SCI) can rec

integirty 6 Nov 01, 2022
A Fast and Accurate One-Stage Approach to Visual Grounding, ICCV 2019 (Oral)

One-Stage Visual Grounding ***** New: Our recent work on One-stage VG is available at ReSC.***** A Fast and Accurate One-Stage Approach to Visual Grou

Zhengyuan Yang 118 Dec 05, 2022
Pytorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks

flownet2-pytorch Pytorch implementation of FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks. Multiple GPU training is supported, a

NVIDIA Corporation 2.8k Dec 27, 2022
An NVDA add-on to split screen reader and audio from other programs to different sound channels

An NVDA add-on to split screen reader and audio from other programs to different sound channels (add-on idea credit: Tony Malykh)

Joseph Lee 7 Dec 25, 2022
Tensorforce: a TensorFlow library for applied reinforcement learning

Tensorforce: a TensorFlow library for applied reinforcement learning Introduction Tensorforce is an open-source deep reinforcement learning framework,

Tensorforce 3.2k Jan 02, 2023
The Video-based Accident Detection System built in Python

Accident-detection-system About the Project This Repository contains the Video-based Accident Detection System built in Python. Contributors Yukta Gop

SURYAVANSHI SNEHAL BALKRISHNA 50 Dec 07, 2022
Certified Patch Robustness via Smoothed Vision Transformers

Certified Patch Robustness via Smoothed Vision Transformers This repository contains the code for replicating the results of our paper: Certified Patc

Madry Lab 35 Dec 14, 2022
PyTorch implementation of the Flow Gaussian Mixture Model (FlowGMM) model from our paper

Flow Gaussian Mixture Model (FlowGMM) This repository contains a PyTorch implementation of the Flow Gaussian Mixture Model (FlowGMM) model from our pa

Pavel Izmailov 124 Nov 06, 2022