Self-supervised Augmentation Consistency for Adapting Semantic Segmentation (CVPR 2021)

Related tags

Deep Learningda-sac
Overview

Self-supervised Augmentation Consistency
for Adapting Semantic Segmentation

License PyTorch

This repository contains the official implementation of our paper:

Self-supervised Augmentation Consistency for Adapting Semantic Segmentation
Nikita Araslanov and Stefan Roth
To appear at CVPR 2021. [arXiv preprint]

drawing

We obtain state-of-the-art accuracy of adapting semantic
segmentation by enforcing consistency across photometric
and similarity transformations. We use neither style transfer
nor adversarial training.

Contact: Nikita Araslanov fname.lname (at) visinf.tu-darmstadt.de


Installation

Requirements. To reproduce our results, we recommend Python >=3.6, PyTorch >=1.4, CUDA >=10.0. At least two Titan X GPUs (12Gb) or equivalent are required for VGG-16; ResNet-101 and VGG-16/FCN need four.

  1. create conda environment:
conda create --name da-sac
source activate da-sac
  1. install PyTorch >=1.4 (see PyTorch instructions). For example,
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
  1. install the dependencies:
pip install -r requirements.txt
  1. download data (Cityscapes, GTA5, SYNTHIA) and create symlinks in the ./data folder, as follows:
./data/cityscapes -> <symlink to Cityscapes>
./data/cityscapes/gtFine2/
./data/cityscapes/leftImg8bit/

./data/game -> <symlink to GTA>
./data/game/labels_cs
./data/game/images

./data/synthia  -> <symlink to SYNTHIA>
./data/synthia/labels_cs
./data/synthia/RGB

Note that all ground-truth label IDs (Cityscapes, GTA5 and SYNTHIA) should be converted to Cityscapes train IDs. The label directories in the above example (gtFine2, labels_cs) therefore refer not to the original labels, but to these converted semantic maps.

Training

Training from ImageNet initialisation proceeds in three steps:

  1. Training the baseline (ABN)
  2. Generating the weights for importance sampling
  3. Training with augmentation consistency from the ABN baseline

1. Training the baseline (ABN)

Here the input are ImageNet models available from the official PyTorch repository. We provide the links to those models for convenience.

Backbone Link
ResNet-101 resnet101-5d3b4d8f.pth (171M)
VGG-16 vgg16_bn-6c64b313.pth (528M)

By default, these models should be placed in ./models/pretrained/ (though configurable with MODEL.INIT_MODEL).

To run the training

bash ./launch/train.sh [gta|synthia] [resnet101|vgg16|vgg16fcn] base

where the first argument specifies the source domain, the second determines the network architecture. The third argument base instructs to run the training of the baseline.

If you would like to skip this step, you can use our pre-trained models:

Source domain: GTA5

Backbone Arch. IoU (val) Link MD5
ResNet-101 DeepLabv2 40.8 baseline_abn_e040.pth (336M) 9fe17[...]c11fc
VGG-16 DeepLabv2 37.1 baseline_abn_e115.pth (226M) d4ffc[...]ef755
VGG-16 FCN 36.7 baseline_abn_e040.pth (1.1G) aa2e9[...]bae53

Source domain: SYNTHIA

Backbone Arch. IoU (val) Link MD5
ResNet-101 DeepLabv2 36.3 baseline_abn_e090.pth (336M) b3431[...]d1a83
VGG-16 DeepLabv2 34.4 baseline_abn_e070.pth (226M) 3af24[...]5b24e
VGG-16 FCN 31.6 baseline_abn_e040.pth (1.1G) 5f457[...]e4b3a

Tip: You can download these files (as well as the final models below) with tools/download_baselines.sh:

cp tools/download_baselines.sh snapshots/cityscapes/baselines/
cd snapshots/cityscapes/baselines/
bash ./download_baselines.sh

2. Generating weights for importance sampling

To generate the weights you need to

  1. generate mask predictions with your baseline (see inference below);
  2. run tools/compute_image_weights.py that reads in those predictions and counts the predictions per each class.

If you would like to skip this step, you can use our weights we computed for the ABN baselines above:

Backbone Arch. Source: GTA5 Source: SYNTHIA
ResNet-101 DeepLabv2 cs_weights_resnet101_gta.data cs_weights_resnet101_synthia.data
VGG-16 DeepLabv2 cs_weights_vgg16_gta.data cs_weights_vgg16_synthia.data
VGG-16 FCN cs_weights_vgg16fcn_gta.data cs_weights_vgg16fcn_synthia.data

Tip: The bash script data/download_weights.sh will download all these importance sampling weights in the current directory.

3. Training with augmentation consistency

To train the model with augmentation consistency, we use the same shell script as in step 1, but without the argument base:

bash ./launch/train.sh [gta|synthia] [resnet101|vgg16|vgg16fcn]

Make sure to specify your baseline snapshot with RESUME bash variable set in the environment (export RESUME=...) or directly in the shell script (commented out by default).

We provide our final models for download.

Source domain: GTA5

Backbone Arch. IoU (val) IoU (test) Link MD5
ResNet-101 DeepLabv2 53.8 55.7 final_e136.pth (504M) 59c16[...]5a32f
VGG-16 DeepLabv2 49.8 51.0 final_e184.pth (339M) 0accb[...]d5881
VGG-16 FCN 49.9 50.4 final_e112.pth (1.6G) e69f8[...]f729b

Source domain: SYNTHIA

Backbone Arch. IoU (val) IoU (test) Link MD5
ResNet-101 DeepLabv2 52.6 52.7 final_e164.pth (504M) a7682[...]db742
VGG-16 DeepLabv2 49.1 48.3 final_e164.pth (339M) c5b31[...]5fdb7
VGG-16 FCN 46.8 45.8 final_e098.pth (1.6G) efb74[...]845cc

Inference and evaluation

Inference

To run single-scale inference from your snapshot, use infer_val.py. The bash script launch/infer_val.sh provides an easy way to run the inference by specifying a few variables:

# validation/training set
FILELIST=[val_cityscapes|train_cityscapes] 
# configuration used for training
CONFIG=configs/[deeplabv2_vgg16|deeplab_resnet101|fcn_vgg16]_train.yaml
# the following 3 variables effectively specify the path to the snapshot
EXP=...
RUN_ID=...
SNAPSHOT=...
# the snapshot path is defined as
# SNAPSHOT_PATH=snapshots/cityscapes/${EXP}/${RUN_ID}/${SNAPSHOT}.pth

Evaluation

Please use the Cityscapes' official evaluation tool evalPixelLevelSemanticLabeling from Cityscapes scripts for evaluating your results.

Citation

We hope you find our work useful. If you would like to acknowledge it in your project, please use the following citation:

@inproceedings{Araslanov:2021:DASAC,
  title     = {Self-supervised Augmentation Consistency for Adapting Semantic Segmentation},
  author    = {Araslanov, Nikita and and Roth, Stefan},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2021}
}
Owner
Visual Inference Lab @TU Darmstadt
Visual Inference Lab @TU Darmstadt
Code of paper: "DropAttack: A Masked Weight Adversarial Training Method to Improve Generalization of Neural Networks"

DropAttack: A Masked Weight Adversarial Training Method to Improve Generalization of Neural Networks Abstract: Adversarial training has been proven to

倪仕文 (Shiwen Ni) 58 Nov 10, 2022
A disassembler for the RP2040 Programmable I/O State-machine!

piodisasm A disassembler for the RP2040 Programmable I/O State-machine! Usage Just run piodisasm.py on a file that contains the PIO code as hex! (Such

Ghidra Ninja 29 Dec 06, 2022
You can draw the corresponding bounding box into the image and save it according to the result file (txt format) run by the tracker.

You can draw the corresponding bounding box into the image and save it according to the result file (txt format) run by the tracker.

Huiyiqianli 42 Dec 06, 2022
[CVPR 2022 Oral] Versatile Multi-Modal Pre-Training for Human-Centric Perception

Versatile Multi-Modal Pre-Training for Human-Centric Perception Fangzhou Hong1  Liang Pan1  Zhongang Cai1,2,3  Ziwei Liu1* 1S-Lab, Nanyang Technologic

Fangzhou Hong 96 Jan 03, 2023
Codebase for "ProtoAttend: Attention-Based Prototypical Learning."

Codebase for "ProtoAttend: Attention-Based Prototypical Learning." Authors: Sercan O. Arik and Tomas Pfister Paper: Sercan O. Arik and Tomas Pfister,

47 2 May 17, 2022
A Python-based development platform for automated trading systems - from backtesting to optimisation to livetrading.

AutoTrader AutoTrader is Python-based platform intended to help in the development, optimisation and deployment of automated trading systems. From sim

Kieran Mackle 485 Jan 09, 2023
DeepLab resnet v2 model in pytorch

pytorch-deeplab-resnet DeepLab resnet v2 model implementation in pytorch. The architecture of deepLab-ResNet has been replicated exactly as it is from

Isht Dwivedi 601 Dec 22, 2022
Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs"

This is the codebase for the paper: Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs Directory Structur

Peter Hase 19 Aug 21, 2022
Codes of the paper Deformable Butterfly: A Highly Structured and Sparse Linear Transform.

Deformable Butterfly: A Highly Structured and Sparse Linear Transform DeBut Advantages DeBut generalizes the square power of two butterfly factor matr

Rui LIN 8 Jun 10, 2022
Code for SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics (ACL'2020).

SentiBERT Code for SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics (ACL'2020). https://arxiv.org/abs/20

Da Yin 66 Aug 13, 2022
MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python

MNE-Python MNE-Python software is an open-source Python package for exploring, visualizing, and analyzing human neurophysiological data such as MEG, E

MNE tools for MEG and EEG data analysis 2.1k Dec 28, 2022
Code for "Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search"

Contextual Non-Local Alignment over Full-Scale Representation for Text-Based Person Search This is an implementation for our paper Contextual Non-Loca

Tencent YouTu Research 50 Dec 03, 2022
Object detection GUI based on PaddleDetection

PP-Tracking GUI界面测试版 本项目是基于飞桨开源的实时跟踪系统PP-Tracking开发的可视化界面 在PaddlePaddle中加入pyqt进行GUI页面研发,可使得整个训练过程可视化,并通过GUI界面进行调参,模型预测,视频输出等,通过多种类型的识别,简化整体预测流程。 GUI界面

杨毓栋 68 Jan 02, 2023
Bayes-Newton—A Gaussian process library in JAX, with a unifying view of approximate Bayesian inference as variants of Newton's algorithm.

Bayes-Newton Bayes-Newton is a library for approximate inference in Gaussian processes (GPs) in JAX (with objax), built and actively maintained by Wil

AaltoML 165 Nov 27, 2022
Repo for our ICML21 paper Unsupervised Learning of Visual 3D Keypoints for Control

Unsupervised Learning of Visual 3D Keypoints for Control [Project Website] [Paper] Boyuan Chen1, Pieter Abbeel1, Deepak Pathak2 1UC Berkeley 2Carnegie

Boyuan Chen 34 Jul 22, 2022
A PyTorch Implementation of FaceBoxes

FaceBoxes in PyTorch By Zisian Wong, Shifeng Zhang A PyTorch implementation of FaceBoxes: A CPU Real-time Face Detector with High Accuracy. The offici

Zi Sian Wong 797 Dec 17, 2022
[2021 MultiMedia] CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval

CONQUER: Contexutal Query-aware Ranking for Video Corpus Moment Retreival PyTorch implementation of CONQUER: Contexutal Query-aware Ranking for Video

Hou zhijian 23 Dec 26, 2022
Repository for reproducing `Model-Based Robust Deep Learning`

Model-Based Robust Deep Learning (MBRDL) In this repository, we include the code necessary for reproducing the code used in Model-Based Robust Deep Le

Alex Robey 16 Sep 19, 2022