Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation

Last update: Dec 12, 2022

Overview

The code of:

Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation, Yukun Su, Ruizhou Sun, Guosheng Lin, Qingyao Wu (https://arxiv.org/abs/2103.01795)

Data augmentation is vital for training deep neural networks. By providing massive training samples, it helps to improve the generalization ability of the model. Weakly supervised semantic segmentation (WSSS) is a challenging problem that has been deeply studied in recent years. Conventional data augmentation approaches for WSSS usually employ geometrical transformations, random cropping, and color jittering. However, merely increasing the amount of data with the same contextual semantics does not help the network much in distinguishing objects. For example, the correct image-level classification of “aeroplane” may be due not only to the recognition of the object itself but also to its co-occurring context such as “sky”, which causes the model to focus less on the object features. To this end, we present a Context Decoupling Augmentation (CDA) method that changes the inherent context in which objects appear and thus drives the network to remove the dependence between object instances and contextual information. Extensive experiments on the PASCAL VOC 2012 dataset with several alternative network architectures demonstrate that CDA can boost various popular WSSS methods to a new state of the art by a large margin.

Thanks to the work of jiwoon-ahn, our work is mainly based on his IRNet repository. For clarity, we only provide the IRN augmentation code; the same modifications can be applied to SEAM and AffinityNet. The model weights are given below.
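The idea summarized above, changing the context in which an object appears, can be illustrated with a minimal copy-paste sketch. This is not the repository's augmentation code. It assumes a binary object mask is already available (for example from pseudo masks), and the function name `paste_object` and all of its parameters are hypothetical names chosen for illustration.

```python
import numpy as np


def paste_object(src_img, src_mask, dst_img, dst_label, obj_class, top, left):
    """Paste the masked object from src_img into a copy of dst_img.

    src_img:   (H, W, 3) image that contains the object
    src_mask:  (H, W) binary mask of the object (assumed to be given)
    dst_img:   (H2, W2, 3) image that supplies the new context
    dst_label: multi-hot image-level label vector of dst_img
    obj_class: class index of the pasted object
    top, left: non-negative position of the object's bounding box in dst_img
    """
    out = dst_img.copy()
    new_label = dst_label.copy()

    ys, xs = np.nonzero(src_mask)
    if ys.size == 0:
        return out, new_label  # empty mask: nothing to paste

    # Tight bounding box around the object in the source image.
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1

    # Clip the box so the pasted region stays inside the destination image.
    h = min(y1 - y0, out.shape[0] - top)
    w = min(x1 - x0, out.shape[1] - left)
    if h <= 0 or w <= 0:
        return out, new_label

    obj = src_img[y0:y0 + h, x0:x0 + w]
    m = src_mask[y0:y0 + h, x0:x0 + w].astype(bool)

    # Overwrite only the object pixels; the destination context is kept elsewhere.
    out[top:top + h, left:left + w][m] = obj[m]

    # The augmented image now also contains the pasted class.
    if m.any():
        new_label[obj_class] = 1
    return out, new_label


# Usage with synthetic data: a 2x2 "object" pasted at a random position.
rng = np.random.default_rng(0)
src = np.zeros((4, 4, 3), dtype=np.uint8)
src[1:3, 1:3] = 255
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1
dst = np.full((6, 6, 3), 10, dtype=np.uint8)
label = np.array([0, 1, 0])
top, left = (int(v) for v in rng.integers(0, 5, size=2))
aug_img, aug_label = paste_object(src, mask, dst, label, 0, top, left)
```

In this sketch the paste position is sampled uniformly and the image-level label of the result is the union of the destination labels and the pasted class. How object instances are selected and placed in the actual method is described in the paper.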

Citation

If you find the code useful, please consider citing our paper using the following BibTeX entry.

@misc{2103.01795,
Author = {Yukun Su and Ruizhou Sun and Guosheng Lin and Qingyao Wu},
Title = {Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation},
Year = {2021},
Eprint = {arXiv:2103.01795},
}

Prerequisite

Python 3.7, PyTorch 1.1.0, and more in requirements.txt
PASCAL VOC 2012 devkit
NVIDIA GPU with more than 1024MB of memory

Usage

Install Python dependencies

pip install -r requirements.txt

Download PASCAL VOC 2012 devkit

Follow instructions in http://host.robots.ox.ac.uk/pascal/VOC/voc2012/#devkit

Run run_sample.py or make your own script

python run_sample.py

You can either manually edit the file or specify command-line arguments.

Results and Trained Models

Class Activation Map

Model	Train (mIoU)	Weights
ResNet-50 for IRNet	50.8	[Weights]
ResNet-38 for SEAM	58.4	[Weights]
ResNet-38 for AffinityNet	48.9	[Weights]

Pseudo Mask Models

Model	Train (mIoU)	Weights
ResNet-50 for IRNet	67.7	[Weights]
ResNet-38 for SEAM	66.4	[Weights]
ResNet-38 for AffinityNet	63.3	[Weights]
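The tables above report mean intersection-over-union (mIoU). As a reminder of what the metric measures, here is a minimal sketch that assumes the standard definition: per-class IoU computed from a confusion matrix, then averaged over classes. The function name `mean_iou` and the `ignore_index=255` default are illustrative assumptions, and the repository's own evaluation script may differ in details such as how absent classes are handled.

```python
import numpy as np


def mean_iou(pred, gt, num_classes, ignore_index=255):
    """Mean IoU over classes that appear in the prediction or the ground truth."""
    pred = np.asarray(pred, dtype=np.int64).ravel()
    gt = np.asarray(gt, dtype=np.int64).ravel()

    # Drop pixels marked as "ignore" in the ground truth.
    keep = gt != ignore_index
    pred, gt = pred[keep], gt[keep]

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(gt * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)

    inter = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter

    # Average only over classes with a non-empty union (assumption of this sketch).
    valid = union > 0
    return float((inter[valid] / union[valid]).mean())


# Usage: two classes, one mislabeled pixel.
score = mean_iou(pred=[0, 1, 1, 1], gt=[0, 0, 1, 1], num_classes=2)
```

For dataset-level numbers the confusion matrix is normally accumulated over all images before the per-class IoU is taken, rather than averaging per-image scores.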

References

Jiwoon Ahn, Sunghyun Cho, and Suha Kwak. Weakly Supervised Learning of Instance Segmentation with Inter-pixel Relations. CVPR, 2019.
Project / Paper
Yude Wang, Jie Zhang, Meina Kan, Shiguang Shan, and Xilin Chen. Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation. CVPR, 2020.
Project / Paper
Jiwoon Ahn and Suha Kwak. Learning Pixel-Level Semantic Affinity With Image-Level Supervision for Weakly Supervised Semantic Segmentation. CVPR, 2018.
Project / Paper
