Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, CVPR 2018

Last update: Dec 15, 2022

Overview

Learning Pixel-level Semantic Affinity with Image-level Supervision

This code is deprecated. Please see https://github.com/jiwoon-ahn/irn instead.

Introduction

This repository contains the code and trained models for:

Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, Jiwoon Ahn and Suha Kwak, CVPR 2018 [Paper]

We have developed a framework based on AffinityNet that generates accurate segmentation labels for training images given only their image-level class labels. A segmentation network trained with our synthesized labels outperforms previous state-of-the-art methods by large margins on PASCAL VOC 2012.

*Our code was first implemented in TensorFlow at the time of the CVPR 2018 submission and was later migrated to PyTorch. Some minor details (optimizer, channel size, etc.) have changed.

Citation

If you find the code useful, please consider citing our paper using the following BibTeX entry.

@InProceedings{Ahn_2018_CVPR,
author = {Ahn, Jiwoon and Kwak, Suha},
title = {Learning Pixel-Level Semantic Affinity With Image-Level Supervision for Weakly Supervised Semantic Segmentation},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}

Prerequisite

Tested on Ubuntu 16.04, with Python 3.5, PyTorch 0.4, Torchvision 0.2.1, CUDA 9.0, and 1x NVIDIA TITAN X (Pascal).
The PASCAL VOC 2012 development kit. You also need to specify the path ('voc12_root') to your downloaded dev kit.
(Optional) To try the VGG-16 based network, you need PyCaffe and the VGG-16 ImageNet pretrained weights [vgg16_20M.caffemodel].
(Optional) To try the ResNet-38 based network, you need Mxnet and the ResNet-38 pretrained weights [ilsvrc-cls_rna-a1_cls1000_ep-0001.params].

Usage

1. Train a classification network to get CAMs.

python3 train_cls.py --lr 0.1 --batch_size 16 --max_epoches 15 --crop_size 448 --network [network.vgg16_cls | network.resnet38_cls] --voc12_root [your_voc12_root_folder] --weights [your_weights_file] --wt_dec 5e-4
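As a rough illustration of what a class activation map (CAM) is, the sketch below weights the final feature maps by the classifier weights of one class. It assumes a classifier that ends in global average pooling followed by a linear layer; the function name `class_activation_map`, the clipping of negative values, and the toy numbers are illustrative assumptions, not code from this repository. Plain Python lists are used for readability; a real pipeline would operate on tensors.

```python
def class_activation_map(features, class_weights):
    """Weighted sum of feature maps for one class, clipped at 0 and scaled to [0, 1].

    features: list of K feature maps, each an H x W nested list.
    class_weights: list of K classifier weights for the class of interest.
    """
    height, width = len(features[0]), len(features[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(features, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    # Keep positive evidence only (an assumption of this sketch),
    # then normalize so the strongest response equals 1.
    cam = [[max(v, 0.0) for v in row] for row in cam]
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


# Toy example: two 2x2 feature maps and the weights of a single class.
features = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
cam = class_activation_map(features, class_weights=[1.0, -0.5])
```

Here the bottom-right location receives the highest score, because the first feature map fires strongly there and carries a positive class weight.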

2. Generate labels for AffinityNet by applying dCRF on CAMs.

python3 infer_cls.py --infer_list voc12/train_aug.txt --voc12_root [your_voc12_root_folder] --network [network.vgg16_cls | network.resnet38_cls] --weights [your_weights_file] --out_cam [desired_folder] --out_la_crf [desired_folder] --out_ha_crf [desired_folder]
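The two CRF outputs above (`--out_la_crf` and `--out_ha_crf`) appear to correspond to a low and a high value of the alpha parameter listed in the results tables. In the paper, the background score at a pixel is (1 - max foreground CAM score) raised to the power alpha, so a low alpha amplifies the background and a high alpha weakens it. The sketch below shows how the two settings could split pixels into confident foreground, confident background, and neutral. It omits the dCRF refinement entirely, and the function name, the default values 4 and 32, and the string labels are assumptions made for illustration.

```python
def confidence_label(fg_scores, low_alpha=4, high_alpha=32):
    """Classify one pixel from its normalized foreground CAM scores.

    fg_scores: CAM scores in [0, 1], one per foreground class.
    Returns "foreground", "background", or "neutral".
    """
    best = max(fg_scores)
    strong_bg = (1.0 - best) ** low_alpha   # amplified background score
    weak_bg = (1.0 - best) ** high_alpha    # weakened background score
    if best > strong_bg:
        # The object score survives even against an amplified background.
        return "foreground"
    if weak_bg > best:
        # The background wins even after being weakened.
        return "background"
    return "neutral"


labels = [
    confidence_label([0.9, 0.1]),
    confidence_label([0.2, 0.1]),
    confidence_label([0.01, 0.0]),
]
```

Pixels that end up neutral would be ignored when training AffinityNet, so only confident regions contribute supervision.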

(Optional) Check the accuracy of CAMs.

python3 infer_cls.py --infer_list voc12/val.txt --voc12_root [your_voc12_root_folder] --network network.resnet38_cls --weights res38_cls.pth --out_cam_pred [desired_folder]
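The accuracy reported in the results tables is mean intersection-over-union (mIoU). A minimal sketch of that metric follows, assuming predictions and ground truth are given as flat sequences of class indices and that the label 255 marks pixels to ignore, as in the PASCAL VOC annotations. The function name `mean_iou` and the choice to skip classes that never appear are assumptions of this sketch, not part of the repository.

```python
def mean_iou(pred, truth, num_classes, ignore_label=255):
    """Mean IoU over the classes that occur in the prediction or the ground truth.

    pred, truth: flat sequences of class indices of equal length.
    pred values are expected to lie in [0, num_classes).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, truth):
        if t == ignore_label:
            continue  # pixels marked as ignore do not count
        if p == t:
            intersection[p] += 1
            union[p] += 1
        else:
            # A mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


score = mean_iou(pred=[0, 0, 1, 1], truth=[0, 1, 1, 255], num_classes=2)
```

For a dataset-level score, the pixels of all images would be accumulated into the same counters before dividing, rather than averaging per-image values.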

3. Train AffinityNet with the labels.

python3 train_aff.py --lr 0.1 --batch_size 8 --max_epoches 8 --crop_size 448 --voc12_root [your_voc12_root_folder] --network [network.vgg16_aff | network.resnet38_aff] --weights [your_weights_file] --wt_dec 5e-4 --la_crf_dir [your_output_folder] --ha_crf_dir [your_output_folder]
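To make the training target more concrete: in the paper, the affinity between two pixels is the exponential of the negative L1 distance between their feature vectors, and only pairs closer than a radius gamma are considered (gamma=5 in the settings reported in the results). A pair is a positive example when both pixels carry the same confident label, a negative example when the labels differ, and is ignored when either pixel is neutral. The sketch below illustrates these three rules; the helper names and the `NEUTRAL` marker are hypothetical and do not come from the repository.

```python
import math

NEUTRAL = -1  # hypothetical marker for pixels without a confident label


def affinity(feat_i, feat_j):
    """Affinity in (0, 1]: exp of the negative L1 distance between two feature vectors."""
    return math.exp(-sum(abs(a - b) for a, b in zip(feat_i, feat_j)))


def within_radius(pos_i, pos_j, gamma=5):
    """True if two distinct (y, x) positions are closer than gamma pixels."""
    dist = math.hypot(pos_i[0] - pos_j[0], pos_i[1] - pos_j[1])
    return 0 < dist < gamma


def pair_target(label_i, label_j):
    """1 for same-class pairs, 0 for different-class pairs, None if either pixel is neutral."""
    if label_i == NEUTRAL or label_j == NEUTRAL:
        return None
    return 1 if label_i == label_j else 0


same = pair_target(3, 3)
different = pair_target(0, 3)
ignored = pair_target(NEUTRAL, 3)
```

Identical features give an affinity of exactly 1, and the value decays toward 0 as the features move apart.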

4. Perform Random Walks on CAMs.

python3 infer_aff.py --infer_list [voc12/val.txt | voc12/train.txt] --voc12_root [your_voc12_root_folder] --network [network.vgg16_aff | network.resnet38_aff] --weights [your_weights_file] --cam_dir [your_output_folder] --out_rw [desired_folder]
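The random walk in this step can be sketched as follows, based on the formulation in the paper: the affinity matrix is raised element-wise to the power beta, normalized so each row sums to 1, and the resulting transition matrix is applied t times to the flattened CAM. The defaults below use beta=8 and t=256 (computed as 8 repeated squarings), matching the settings reported in the results. The function names and the dense pure-Python matrices are assumptions for illustration; the actual implementation in `infer_aff.py` may differ.

```python
def transition_matrix(affinity, beta=8):
    """Row-normalize the element-wise beta-th power of an affinity matrix."""
    powered = [[w ** beta for w in row] for row in affinity]
    result = []
    for row in powered:
        total = sum(row)
        result.append([w / total for w in row])
    return result


def matmul(a, b):
    """Plain dense matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def random_walk(cam, affinity, beta=8, log2_t=8):
    """Propagate a flattened CAM with T^t, where t = 2 ** log2_t."""
    trans = transition_matrix(affinity, beta)
    for _ in range(log2_t):
        trans = matmul(trans, trans)  # repeated squaring doubles the exponent
    return [sum(t * m for t, m in zip(row, cam)) for row in trans]


# Three pixels: 0 and 1 are strongly related, pixel 2 is unrelated to both.
affinity_matrix = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
refined = random_walk(cam=[1.0, 0.0, 0.0], affinity=affinity_matrix)
```

In this toy case the activation at pixel 0 spreads evenly to pixel 1 and never reaches pixel 2, which is the intended effect: scores diffuse within a semantically coherent region and stop at its boundary.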

Results and Trained Models

Class Activation Map

Model	Train (mIoU)	Val (mIoU)	Weights / Note
VGG-16	48.9	46.6	[Weights]
ResNet-38	47.7	47.2	[Weights]
ResNet-38	48.0	46.8	CVPR submission

Random Walk with AffinityNet

Model	alpha	Train (mIoU)	Val (mIoU)	Weights / Note
VGG-16	4/16/32	59.6	54.0	[Weights]
ResNet-38	4/16/32	61.0	60.2	[Weights]
ResNet-38	4/16/24	58.1	57.0	CVPR submission

*beta=8, gamma=5, t=256 for all settings
