Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation (CVPR 2022)

Last update: Dec 27, 2022

Related tags

Deep Learning image-segmentation

Overview

CCAM (Unsupervised)

Code repository for our paper "CCAM: Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation" in CVPR 2022.

The repository includes full training, evaluation, and visualization codes on CUB-200-2011, ILSVRC2012, and PASCAL VOC2012 datasets.

We provide the extracted class-agnostic bounding boxes (on CUB-200-2011 and ILSVRC2012) and background cues (on PASCAL VOC12) at here.

Dependencies

Python 3
PyTorch 1.7.1
OpenCV-Python
Numpy
Scipy
MatplotLib
Yaml
Easydict

Dataset

CUB-200-2011

You will need to download the images (JPEG format) in CUB-200-2011 dataset at here. Make sure your data/CUB_200_2011 folder is structured as follows:

├── CUB_200_2011/
|   ├── images
|   ├── images.txt
|   ├── bounding_boxes.txt
|   ...
|   └── train_test_split.txt

You will need to download the images (JPEG format) in ILSVRC2012 dataset at here. Make sure your data/ILSVRC2012 folder is structured as follows:

ILSVRC2012

├── ILSVRC2012/ 
|   ├── train
|   ├── val
|   ├── val_boxes
|   |   ├——val
|   |   |   ├—— ILSVRC2012_val_00050000.xml
|   |   |   ├—— ...
|   ├── train.txt
|   └── val.txt

PASCAL VOC2012

You will need to download the images (JPEG format) in PASCAL VOC2012 dataset at here. Make sure your data/VOC2012 folder is structured as follows:

├── VOC2012/
|   ├── Annotations
|   ├── ImageSets
|   ├── SegmentationClass
|   ├── SegmentationClassAug
|   └── SegmentationObject

For WSOL task

please refer to the directory of './WSOL'

cd WSOL

For WSSS task

please refer to the directory of './WSSS'

cd WSSS

Comparison with CAM

CUSTOM DATASET

As CCAM is an unsupervised method, it can be applied to various scenarios, like ReID, Saliency detection, or skin lesion detection. We provide an example to apply CCAM on your custom dataset like 'Market-1501'.

cd CUSTOM

Reference

If you are using our code, please consider citing our paper.

@article{xie2022contrastive,
  title={Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation},
  author={Xie, Jinheng and Xiang, Jianfeng and Chen, Junliang and Hou, Xianxu and Zhao, Xiaodong and Shen, Linlin},
  journal={arXiv preprint arXiv:2203.13505},
  year={2022}
}

Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation (CVPR 2022)

Related tags

Overview

CCAM (Unsupervised)

Dependencies

Dataset

CUB-200-2011

ILSVRC2012

PASCAL VOC2012

For WSOL task

For WSSS task

Comparison with CAM

CUSTOM DATASET

Reference

Owner

Computer Vision Insitute, SZU

This is the code of "Multi-view Contrastive Graph Clustering" in NeurlPS 2021.

Deep Learning Emotion decoding using EEG data from Autism individuals

HandFoldingNet ✌️ : A 3D Hand Pose Estimation Network Using Multiscale-Feature Guided Folding of a 2D Hand Skeleton

"Segmenter: Transformer for Semantic Segmentation" reproduced via mmsegmentation

DIP-football - A football video analyse system based on Yolov5, alphapose, Qt6

Apply Graph Self-Supervised Learning methods to graph-level task(TUDataset, MolculeNet Datset)

Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms

Teaching end to end workflow of deep learning

The trained model and denoising example for paper : Cardiopulmonary Auscultation Enhancement with a Two-Stage Noise Cancellation Approach

Simple Tensorflow implementation of "Adaptive Convolutions for Structure-Aware Style Transfer" (CVPR 2021)

The implementation of "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement"

PyTorch code for our ECCV 2018 paper "Image Super-Resolution Using Very Deep Residual Channel Attention Networks"

Image-Scaling Attacks and Defenses

Monk is a low code Deep Learning tool and a unified wrapper for Computer Vision.

A python module for configuration of block devices

FCA: Learning a 3D Full-coverage Vehicle Camouflage for Multi-view Physical Adversarial Attack

Self-Supervised depth kalilia

A free, multiplatform SDK for real-time facial motion capture using blendshapes, and rigid head pose in 3D space from any RGB camera, photo, or video.

Tensorflow implementation for Self-supervised Graph Learning for Recommendation

Code for WSDM 2022 paper, Contrastive Learning for Representation Degeneration Problem in Sequential Recommendation.