This is the code for CVPR 2021 oral paper: Jigsaw Clustering for Unsupervised Visual Representation Learning

Overview

JigsawClustering

Jigsaw Clustering for Unsupervised Visual Representation Learning

Pengguang Chen, Shu Liu, Jiaya Jia

Introduction

This project provides an implementation for the CVPR 2021 paper "Jigsaw Clustering for Unsupervised Visual Representation Learning"

Installation

Environment

We verify our code on

  • 4x2080Ti GPUs
  • CUDA 10.1
  • python 3.7
  • torch 1.6.0
  • torchvision 0.7.0

Other similar envirouments should also work properly.

Install

We use the SyncBN from apex, please install apex refer to https://github.com/NVIDIA/apex (SyncBN from pytorch should also work properly, we will verify it later.)

We use detectron2 for the training of detection tasks. If you are willing to finetune our pretrained model on the detection task, please install detectron2 refer to https://github.com/facebookresearch/detectron2

git clone https://github.com/Jia-Research-Lab/JigsawClustering.git
cd JigsawClustering/
pip install diffdist

Dataset

Please put the data under ./datasets. The directory looks like:

datasets
│
│───ImageNet/
│   │───class1/
│   │───class2/
│   │   ...
│   └───class1000/
│   
│───coco/
│   │───annotations/
│   │───train2017/
│   └───val2017/
│
│───VOC2012/
│   
└───VOC2007/

Results and pretrained model

The pretrained model is available at here.

Task Dataset Results
Linear Evaluation ImageNet 66.4
Semi-Supervised 1% ImageNet 40.7
Semi-Supervised 10% ImageNet 63.0
Detection COCO 39.3

Training

Pre-training on ImageNet

python main.py --dist-url 'tcp://localhost:10107' --multiprocessing-distributed --world-size 1 --rank 0 \
    -a resnet50 \
    --lr 0.03 --batch-size 256 --epoch 200 \
    --save-dir outputs/jigclu_pretrain/ \
    --resume outputs/jigclu_pretrain/model_best.pth.tar \
    --loss-t 0.3 \
    --cross-ratio 0.3 \
    datasets/ImageNet/

Linear evaluation on ImageNet

python main_lincls.py --dist-url 'tcp://localhost:10007' --multiprocessing-distributed --world-size 1 --rank 0 \
    -a resnet50 \
    --lr 10.0 --batch-size 256 \
    --prefix module.encoder. \
    --pretrained outputs/jigclu_pretrain/model_best.pth.tar \
    --save-dir outputs/jigclu_linear/ \
    datasets/ImageNet/

Semi-Supervised finetune on ImageNet

10% label

python main_semi.py --dist-url 'tcp://localhost:10102' --multiprocessing-distributed --world-size 1 --rank 0 \
    -a resnet50 \
    --batch-size 256 \
    --wd 0.0 --lr 0.01 --lr-last-layer 0.2 \
    --syncbn \
    --prefix module.encoder. \
    --labels-perc 10 \
    --pretrained outputs/jigclu_pretrain/model_best.pth.tar \
    --save-dir outputs/jigclu_semi_10p/ \
    datasets/ImageNet/

1% label

python main_semi.py --dist-url 'tcp://localhost:10101' --multiprocessing-distributed --world-size 1 --rank 0 \
    -a resnet50 \
    --batch-size 256 \
    --wd 0.0 --lr 0.02 --lr-last-layer 5.0 \
    --syncbn \
    --prefix module.encoder. \
    --labels-perc 1 \
    --pretrained outputs/jigclu_pretrain/model_best.pth.tar \
    --save-dir outputs/jigclu_semi_1p/ \
    datasets/ImageNet/

Transfer to COCO detection

Please convert the pretrained weight first

python detection/convert.py

Then start training using

python detection/train_net.py --config-file detection/configs/R50-JigClu.yaml --num-gpus 4

VOC detection

python detection/train_net.py --config-file detection/configs/voc-R50-JigClu.yaml --num-gpus 4

Citation

Please consider citing JigsawClustering in your publications if it helps your research.

@inproceedings{chen2021jigclu,
    title={Jigsaw Clustering for Unsupervised Visual Representation Learning},
    author={Pengguang Chen, Shu Liu, and Jiaya Jia},
    booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2021},
}
Comments
  • Some question about trainning

    Some question about trainning

    Hi~Thanks for your excellent work! I have a machine with 2 1080Ti,and I want to train your model on CIFAR10 with resnet18.

    I use the parmeters like this ,but it seems don't work. 1632405015(1)

    The program is stuck in this situation.

    1632405115(1)

    opened by zbw0329 10
  • Some details about the training

    Some details about the training

    Hi, I have recently read your paper and find it very interesting. There are still some confusions about the experiments.

    The experiments require 4 2080ti for training. Does it mean we must have 4 2080ti on one single machine? What if I have 4 2080ti on different machines? Is there any suggestion for this situation? BTW, how long does it take when you train on ImageNet1k?

    Much appreciation for your reply.

    Best wishes!

    opened by Hanzy1996 3
  • Some questions about the results of ImageNet100

    Some questions about the results of ImageNet100

    Thank you for your wonderful work, I want to do some more works based on your code. But I meet some questions about the results. I use the JigsawClustering and the dataset ImageNet100 to train the model. I only changed one line in the model to fit this dataset(I added model.fc = nn.Linear(2048, 100) in line 162 of main_lincls.py). However, despite using 4 GPUs, and did not change the configuration file. I only got an accuracy of 79.24. There is still a certain gap between this and the 80.9 reported in the paper. How can I achieve the accuracy reported in the paper now? Once again, thank you for your excellent work and code. I am looking forward to your reply.

    opened by WilyZhao8 1
  • Results of Faster-RCNN R50-FPN with model pretrained on ImageNet with standard cross-entropy loss

    Results of Faster-RCNN R50-FPN with model pretrained on ImageNet with standard cross-entropy loss

    Hi, thanks for your work! In Objection Detection, do you apply ResNet-50 model pretrained on ImageNet with standard cross-entropy loss to Faster-RCNN R50-FPN?

    opened by fzfs 1
  • Training the model on a single GPU

    Training the model on a single GPU

    Hi! I'm aware that the question has been asked previously, but could you guide how to modify jigclu to remove the distributeddataparallel depedency?

    Thanks!

    opened by shuvam-creditmate 2
  • It seems that the model has not learned anything,What should I do?

    It seems that the model has not learned anything,What should I do?

    Thanks for your excellent work! I change the dataloader to use JigClu in CIFAR-10,and train the model on it by 1000epoch. But the prediction of my model is all the same. It seem that model always cluster into the same cluster

    opened by zbw0329 10
Releases(1.0)
Owner
DV Lab
Deep Vision Lab
DV Lab
Object DGCNN and DETR3D, Our implementations are built on top of MMdetection3D.

This repo contains the implementations of Object DGCNN (https://arxiv.org/abs/2110.06923) and DETR3D (https://arxiv.org/abs/2110.06922). Our implementations are built on top of MMdetection3D.

Wang, Yue 539 Jan 07, 2023
Exploit ILP to learn symmetry breaking constraints of ASP programs.

ILP Symmetry Breaking Overview This project aims to exploit inductive logic programming to lift symmetry breaking constraints of ASP programs. Given a

Research Group Production Systems 1 Apr 13, 2022
Multi Camera Calibration

Multi Camera Calibration 'modules/camera_calibration/app/camera_calibration.cpp' is for calculating extrinsic parameter of each individual cameras. 'm

7 Dec 01, 2022
YolactEdge: Real-time Instance Segmentation on the Edge

YolactEdge, the first competitive instance segmentation approach that runs on small edge devices at real-time speeds. Specifically, YolactEdge runs at up to 30.8 FPS on a Jetson AGX Xavier (and 172.7

Haotian Liu 1.1k Jan 06, 2023
An image processing project uses Viola-jones technique to detect faces and then use SIFT algorithm for recognition.

Attendance_System An image processing project uses Viola-jones technique to detect faces and then use LPB algorithm for recognition. Face Detection Us

8 Jan 11, 2022
MLSpace: Hassle-free machine learning & deep learning development

MLSpace: Hassle-free machine learning & deep learning development

abhishek thakur 293 Jan 03, 2023
Deep Learning (with PyTorch)

Deep Learning (with PyTorch) This notebook repository now has a companion website, where all the course material can be found in video and textual for

Alfredo Canziani 6.2k Jan 07, 2023
Chatbot in 200 lines of code using TensorLayer

Seq2Seq Chatbot This is a 200 lines implementation of Twitter/Cornell-Movie Chatbot, please read the following references before you read the code: Pr

TensorLayer Community 820 Dec 17, 2022
Paddle pit - Rethinking Spatial Dimensions of Vision Transformers

基于Paddle实现PiT ——Rethinking Spatial Dimensions of Vision Transformers,arxiv 官方原版代

Hongtao Wen 4 Jan 15, 2022
Full Resolution Residual Networks for Semantic Image Segmentation

Full-Resolution Residual Networks (FRRN) This repository contains code to train and qualitatively evaluate Full-Resolution Residual Networks (FRRNs) a

Toby Pohlen 274 Oct 27, 2022
Testing and Estimation of structural breaks in Stata

xtbreak estimating and testing for many known and unknown structural breaks in time series and panel data. For an overview of xtbreak test see xtbreak

Jan Ditzen 13 Jun 19, 2022
PyTorch implementation for ComboGAN

ComboGAN This is our ongoing PyTorch implementation for ComboGAN. Code was written by Asha Anoosheh (built upon CycleGAN) [ComboGAN Paper] If you use

Asha Anoosheh 139 Dec 20, 2022
Official implementation of "StyleCariGAN: Caricature Generation via StyleGAN Feature Map Modulation" (SIGGRAPH 2021)

StyleCariGAN: Caricature Generation via StyleGAN Feature Map Modulation This repository contains the official PyTorch implementation of the following

Wonjong Jang 270 Dec 30, 2022
Download files from DSpace systems (because for some reason DSpace won't let you)

DSpaceDL A tool for downloading files from DSpace items. For some reason, DSpace systems have a dogshit UI, and Universities absolutely LOOOVE to use

Soumitra Shewale 5 Dec 01, 2022
Official repository for Fourier model that can generate periodic signals

Conditional Generation of Periodic Signals with Fourier-Based Decoder Jiyoung Lee, Wonjae Kim, Daehoon Gwak, Edward Choi This repository provides offi

8 May 25, 2022
This is an official implementation of our CVPR 2021 paper "Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression" (https://arxiv.org/abs/2104.02300)

Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression Introduction In this paper, we are interested in the bottom-up paradigm of estima

HRNet 367 Dec 27, 2022
PyTorch implementation of the paper Ultra Fast Structure-aware Deep Lane Detection

PyTorch implementation of the paper Ultra Fast Structure-aware Deep Lane Detection

1.4k Jan 06, 2023
An Approach to Explore Logistic Regression Models

User-centered Regression An Approach to Explore Logistic Regression Models This tool applies the potential of Attribute-RadViz in identifying correlat

0 Nov 12, 2021
Code for pre-training CharacterBERT models (as well as BERT models).

Pre-training CharacterBERT (and BERT) This is a repository for pre-training BERT and CharacterBERT. DISCLAIMER: The code was largely adapted from an o

Hicham EL BOUKKOURI 31 Dec 05, 2022