HCSC: Hierarchical Contrastive Selective Coding

This repository provides a PyTorch implementation and model weights for HCSC (Hierarchical Contrastive Selective Coding), whose details are in this paper.

HCSC is an effective and efficient method for pre-training image encoders in a self-supervised fashion. It seeks to learn image representations that capture hierarchical semantic structure. Hierarchical K-means is used to derive hierarchical prototypes, which represent the hierarchical semantics underlying the data. On this basis, we perform Instance-wise and Prototypical Contrastive Selective Coding to inject the information carried by the hierarchical prototypes into the image representations. HCSC achieves state-of-the-art performance on the self-supervised pre-training of CNNs (e.g., ResNet-50), and we will further study its potential for pre-training Vision Transformers.
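
As a rough illustration of how hierarchical K-means can yield prototypes at several levels, the sketch below clusters feature vectors and then clusters the resulting centroids again to obtain a coarser level. It is a minimal pure-Python sketch, not the implementation in this repository: the function names (`kmeans`, `hierarchical_kmeans`), the Euclidean distance, and the cluster counts are all illustrative assumptions.

```python
import random


def _sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def _assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda j: _sq_dist(p, centroids[j]))
        for p in points
    ]


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, assignments)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct points
    for _ in range(iters):
        assignments = _assign(points, centroids)
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = _mean(members)
    return centroids, _assign(points, centroids)


def hierarchical_kmeans(features, ks):
    """Cluster `features` into ks[0] groups, then cluster those centroids
    into ks[1] groups, and so on. Each level stores its prototypes and the
    assignment of that level's inputs to them."""
    levels = []
    current = list(features)
    for k in ks:
        prototypes, assignments = kmeans(current, k)
        levels.append({"prototypes": prototypes, "assignments": assignments})
        current = prototypes  # the next level clusters these prototypes
    return levels


if __name__ == "__main__":
    # Toy 2-D "features": four small blobs that form two larger groups.
    feats = [(0.0, 0.0), (0.1, 0.0), (0.0, 1.0), (0.1, 1.0),
             (10.0, 0.0), (10.1, 0.0), (10.0, 1.0), (10.1, 1.0)]
    for depth, level in enumerate(hierarchical_kmeans(feats, [4, 2])):
        print(depth, level["prototypes"])
```

In this sketch, the `assignments` list of each level links an item to its parent prototype one level up, which is the structure that lets fine-grained prototypes inherit coarser semantics.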

Roadmap

  • [2022/02/01] The initial release! We release all source code for pre-training and downstream evaluation. We release three pre-trained ResNet-50 models: 200 epochs (single-crop), 200 epochs (multi-crop) and 400 epochs (single-crop, batch size: 256).

TODO

  • Finish pre-training the 400-epoch ResNet-50 models (multi-crop) and release them.
  • Finish pre-training the 800-epoch ResNet-50 models (single- & multi-crop) and release them.
  • Support Vision Transformer backbones.
  • Pre-train Vision Transformers with HCSC and release model weights under various configurations.

Model Zoo

We will continue to release pre-trained HCSC model weights and the corresponding training configs. The models finished so far are listed below:

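| Backbone  | Method | Crop   | Epoch | Batch size | Lincls top-1 Acc. (%) | KNN top-1 Acc. (%) | url   | config |
|-----------|--------|--------|-------|------------|-----------------------|--------------------|-------|--------|
| ResNet-50 | HCSC   | Single | 200   | 256        | 69.2                  | 60.7               | model | config |
| ResNet-50 | HCSC   | Multi  | 200   | 256        | 73.3                  | 66.6               | model | config |
| ResNet-50 | HCSC   | Single | 400   | 256        | 70.6                  | 63.4               | model | config |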

Installation

Use the following command to install the dependencies (Python 3.7 with pip installed):

pip3 install -r requirement.txt

If you have trouble installing PyTorch, follow the official guidance (https://pytorch.org/). Note that the code has been tested with cudatoolkit version 10.2.

Pre-training on ImageNet

Download the ImageNet dataset to [ImageNet Folder]. Then go to "[ImageNet Folder]/val" and use this script to build the sub-folders.
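
The linked script is not reproduced here. As a minimal sketch of the same idea in Python, the helper below moves each validation image into a directory named after its class, which is presumably the layout the sub-folder step produces (one directory per class, mirroring `train`). The function name `build_val_subfolders` and the file-name-to-class mapping are assumptions for illustration; how you obtain that mapping depends on your copy of the dataset.

```python
import shutil
from pathlib import Path


def build_val_subfolders(val_dir, image_to_class):
    """Move each image in `val_dir` into `val_dir/<class id>/`.

    image_to_class: dict mapping image file name -> class id. This is a
    hypothetical input; build it from the validation annotations that
    ship with your copy of the dataset.
    Returns the number of files that were moved.
    """
    val_dir = Path(val_dir)
    moved = 0
    for name, class_id in image_to_class.items():
        src = val_dir / name
        if not src.is_file():
            continue  # already moved or missing: skip it
        dst_dir = val_dir / class_id
        dst_dir.mkdir(exist_ok=True)
        shutil.move(str(src), str(dst_dir / name))
        moved += 1
    return moved
```

Because files that are already in place are skipped, the helper can be re-run safely after an interrupted pass.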

To train single-crop HCSC on 8 Tesla-V100-32GB GPUs for 200 epochs, run:

python3 -m torch.distributed.launch --master_port [your port] --nproc_per_node=8 \
pretrain.py [your ImageNet Folder]

To train multi-crop HCSC on 8 Tesla-V100-32GB GPUs for 200 epochs, run:

python3 -m torch.distributed.launch --master_port [your port] --nproc_per_node=8 \
pretrain.py --multicrop [your ImageNet Folder]

Downstream Evaluation

Evaluation: Linear Classification on ImageNet

With a pre-trained model, to train a supervised linear classifier with all available GPUs, run:

python3 eval_lincls_imagenet.py --data [your ImageNet Folder] \
--dist-url tcp://localhost:10001 --world-size 1 --rank 0 \
--pretrained [your pre-trained model (example:out.pth)]

Evaluation: KNN Evaluation on ImageNet

To reproduce the KNN evaluation results with a pre-trained model using a single GPU, run:

python3 -m torch.distributed.launch --master_port [your port] --nproc_per_node=1 eval_knn.py \
--checkpoint_key state_dict \
--pretrained [your pre-trained model] \
--data [your ImageNet Folder]
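
For readers unfamiliar with KNN evaluation, the sketch below shows one common protocol: features of labeled training images form a bank, and each test feature is classified by a similarity-weighted vote among its k nearest bank entries. This is an illustrative pure-Python sketch and may differ from what `eval_knn.py` does; the function names, the cosine similarity, and the default `k` and `temperature` values are assumptions.

```python
import math
from collections import defaultdict


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def knn_predict(query, bank, bank_labels, k=3, temperature=0.07):
    """Predict a label for `query` by a temperature-weighted vote over
    its k most similar bank features (cosine similarity)."""
    q = l2_normalize(query)
    sims = []
    for feat, label in zip(bank, bank_labels):
        f = l2_normalize(feat)
        sims.append((sum(a * b for a, b in zip(q, f)), label))
    sims.sort(key=lambda s: s[0], reverse=True)

    votes = defaultdict(float)
    for sim, label in sims[:k]:
        votes[label] += math.exp(sim / temperature)  # closer neighbors weigh more
    return max(votes, key=votes.get)


def knn_top1_accuracy(queries, query_labels, bank, bank_labels, k=3):
    """Top-1 accuracy in percent over all queries."""
    correct = sum(
        knn_predict(q, bank, bank_labels, k) == y
        for q, y in zip(queries, query_labels)
    )
    return 100.0 * correct / len(queries)
```

No parameters are trained in this protocol, so the resulting accuracy reflects the quality of the frozen features alone.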

Evaluation: Semi-supervised Learning on ImageNet

To fine-tune a pre-trained model with 1% or 10% ImageNet labels with 8 Tesla-V100-32GB GPUs, run:

1% of labels:

python3 -m torch.distributed.launch --nproc_per_node 8 --master_port [your port] eval_semisup.py \
--labels_perc 1 \
--pretrained [your pretrained weights] \
[your ImageNet Folder]

10% of labels:

python3 -m torch.distributed.launch --nproc_per_node 8 --master_port [your port] eval_semisup.py \
--labels_perc 10 \
--pretrained [your pretrained weights] \
[your ImageNet Folder]

Evaluation: Transfer Learning - Classification on VOC / Places205

VOC

1. Download the VOC dataset.
2. Finetune and evaluate on PASCAL VOC (with a single GPU):
cd voc_cls/ 
python3 main.py --data [your voc data folder] \
--pretrained [your pretrained weights]

Places205

1. Download the Places205 dataset (resized 256x256 version)
2. Linear Classification on Places205 (with all available GPUs):
python3 eval_lincls_places.py --data [your places205 data folder] \
--data-url tcp://localhost:10001 \
--pretrained [your pretrained weights]

Evaluation: Transfer Learning - Object Detection on VOC / COCO

1. Download VOC and COCO Dataset (under ./detection/datasets).

2. Install detectron2.

3. Convert a pre-trained model to the format of detectron2:

cd detection
python3 convert-pretrain-to-detectron2.py [your pretrained weight] out.pkl

4. Train on PASCAL VOC/COCO:

Finetune and evaluate on VOC (with 8 Tesla-V100-32GB GPUs):

cd detection
python3 train_net.py --config-file ./configs/pascal_voc_R_50_C4_24k_hcsc.yaml \
--num-gpus 8 MODEL.WEIGHTS out.pkl

Finetune and evaluate on COCO (with 8 Tesla-V100-32GB GPUs):

cd detection
python3 train_net.py --config-file ./configs/coco_R_50_C4_2x_hcsc.yaml \
--num-gpus 8 MODEL.WEIGHTS out.pkl
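
The conversion in step 3 is needed because detectron2 names ResNet parameters differently from torchvision-style checkpoints. The sketch below shows what such a renaming typically looks like, modeled on the convention used by MoCo's conversion script. Whether `convert-pretrain-to-detectron2.py` in this repository performs exactly these substitutions is an assumption, as is the `module.encoder_q.` prefix on the checkpoint keys.

```python
def to_detectron2_key(key, prefix="module.encoder_q."):
    """Map a torchvision-style ResNet parameter name to detectron2 naming.

    Returns None for keys that are not part of the backbone (other
    encoders, projection heads). The default prefix is an assumption
    about how the pre-training checkpoint was saved.
    """
    if not key.startswith(prefix):
        return None
    k = key[len(prefix):]
    if k.startswith("fc."):
        return None  # classifier / projection head is not used for detection
    if "layer" not in k:
        k = "stem." + k  # conv1 / bn1 belong to the stem
    for t in (1, 2, 3, 4):
        k = k.replace(f"layer{t}", f"res{t + 1}")
    for t in (1, 2, 3):
        k = k.replace(f"bn{t}", f"conv{t}.norm")
    k = k.replace("downsample.0", "shortcut")
    k = k.replace("downsample.1", "shortcut.norm")
    return k


def convert_state_dict(state_dict, prefix="module.encoder_q."):
    """Rename all backbone entries of `state_dict`, dropping everything else."""
    converted = {}
    for key, value in state_dict.items():
        new_key = to_detectron2_key(key, prefix)
        if new_key is not None:
            converted[new_key] = value
    return converted
```

For example, under these assumptions `module.encoder_q.layer1.0.downsample.0.weight` would become `res2.0.shortcut.weight`.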

Evaluation: Clustering Evaluation on ImageNet

To reproduce the clustering evaluation results with a pre-trained model using all available GPUs, run:

python3 eval_clustering.py --dist-url tcp://localhost:10001 \
--multiprocessing-distributed --world-size 1 --rank 0 \
--num-cluster [target num cluster] \
--pretrained [your pretrained model weights] \
[your ImageNet Folder]

In the experiments of our paper, we set --num-cluster to 25000 and 1000.
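
Clustering quality is usually summarized by comparing cluster assignments against ground-truth labels. Which metrics `eval_clustering.py` reports is not stated here, so the following is only an illustrative assumption: a minimal pure-Python computation of normalized mutual information (NMI), one commonly used score, with the function name `normalized_mutual_info` chosen for this sketch.

```python
import math
from collections import Counter


def normalized_mutual_info(labels, clusters):
    """NMI with arithmetic-mean normalization:
    I(Y; C) / ((H(Y) + H(C)) / 2), using natural logarithms."""
    n = len(labels)
    count_y = Counter(labels)
    count_c = Counter(clusters)
    count_yc = Counter(zip(labels, clusters))

    def entropy(counts):
        return -sum((v / n) * math.log(v / n) for v in counts.values())

    # Mutual information from the joint and marginal counts.
    mi = 0.0
    for (y, c), v in count_yc.items():
        mi += (v / n) * math.log(v * n / (count_y[y] * count_c[c]))

    denom = (entropy(count_y) + entropy(count_c)) / 2
    # Both partitions trivial (one class, one cluster): treat as perfect match.
    return mi / denom if denom > 0 else 1.0
```

The score is invariant to how cluster indices are numbered, so a clustering that matches the labels up to a permutation scores 1, while an assignment that is independent of the labels scores 0.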

License

This repository is released under the MIT license; see the LICENSE file for details.

Citation

If you find this repository useful, please kindly consider citing the following paper:

@article{guo2022hcsc,
  title={HCSC: Hierarchical Contrastive Selective Coding},
  author={Guo, Yuanfan and Xu, Minghao and Li, Jiawen and Ni, Bingbing and Zhu, Xuanyu and Sun, Zhenbang and Xu, Yi},
  journal={arXiv preprint arXiv:2202.00455},
  year={2022}
}