Official Implementation of DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation

Related tags

Deep LearningDAFormer
Overview

DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation

[Arxiv] [Paper]

As acquiring pixel-wise annotations of real-world images for semantic segmentation is a costly process, a model can instead be trained with more accessible synthetic data and adapted to real images without requiring their annotations. This process is studied in Unsupervised Domain Adaptation (UDA).

Even though a large number of methods propose new UDA strategies, they are mostly based on outdated network architectures. In this work, we particularly study the influence of the network architecture on UDA performance and propose DAFormer, a network architecture tailored for UDA. It consists of a Transformer encoder and a multi-level context-aware feature fusion decoder.

DAFormer is enabled by three simple but crucial training strategies to stabilize the training and to avoid overfitting the source domain: While the Rare Class Sampling on the source domain improves the quality of pseudo-labels by mitigating the confirmation bias of self-training towards common classes, the Thing-Class ImageNet Feature Distance and a Learning Rate Warmup promote feature transfer from ImageNet pretraining.

DAFormer significantly improves the state-of-the-art performance by 10.8 mIoU for GTA→Cityscapes and by 5.4 mIoU for Synthia→Cityscapes and enables learning even difficult classes such as train, bus, and truck well.

UDA over time

The strengths of DAFormer, compared to the previous state-of-the-art UDA method ProDA, can also be observed in qualitative examples from the Cityscapes validation set.

Demo Color Palette

For more information on DAFormer, please check our [Paper].

If you find this project useful in your research, please consider citing:

@article{hoyer2021daformer,
  title={DAFormer: Improving Network Architectures and Training Strategies for Domain-Adaptive Semantic Segmentation},
  author={Hoyer, Lukas and Dai, Dengxin and Van Gool, Luc},
  journal={arXiv preprint arXiv:2111.14887},
  year={2021}
}

Setup Environment

For this project, we used python 3.8.5. We recommend setting up a new virtual environment:

python -m venv ~/venv/daformer
source ~/venv/daformer/bin/activate

In that environment, the requirements can be installed with:

pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
pip install mmcv-full==1.3.7  # requires the other packages to be installed first

Further, please download the MiT weights and a pretrained DAFormer using the following script. If problems occur with the automatic download, please follow the instructions for a manual download within the script.

sh tools/download_checkpoints.sh

All experiments were executed on a NVIDIA RTX 2080 Ti.

Inference Demo

Already as this point, the provided DAFormer model (downloaded by tools/download_checkpoints.sh) can be applied to a demo image:

python -m demo.image_demo demo/demo.png work_dirs/211108_1622_gta2cs_daformer_s0_7f24c/211108_1622_gta2cs_daformer_s0_7f24c.json work_dirs/211108_1622_gta2cs_daformer_s0_7f24c/latest.pth

When judging the predictions, please keep in mind that DAFormer had no access to real-world labels during the training.

Setup Datasets

Cityscapes: Please, download leftImg8bit_trainvaltest.zip and gt_trainvaltest.zip from here and extract them to data/cityscapes.

GTA: Please, download all image and label packages from here and extract them to data/gta.

Synthia: Please, download SYNTHIA-RAND-CITYSCAPES from here and extract it to data/synthia.

The final folder structure should look like this:

DAFormer
├── ...
├── data
│   ├── cityscapes
│   │   ├── leftImg8bit
│   │   │   ├── train
│   │   │   ├── val
│   │   ├── gtFine
│   │   │   ├── train
│   │   │   ├── val
│   ├── gta
│   │   ├── images
│   │   ├── labels
│   ├── synthia
│   │   ├── RGB
│   │   ├── GT
│   │   │   ├── LABELS
├── ...

Data Preprocessing: Finally, please run the following scripts to convert the label IDs to the train IDs and to generate the class index for RCS:

python tools/convert_datasets/gta.py data/gta --nproc 8
python tools/convert_datasets/cityscapes.py data/cityscapes --nproc 8
python tools/convert_datasets/synthia.py data/synthia/ --nproc 8

Training

For convenience, we provide an annotated config file of the final DAFormer. A training job can be launched using:

python run_experiments.py --config configs/daformer/gta2cs_uda_warm_fdthings_rcs_croppl_a999_daformer_mitb5_s0.py

For the experiments in our paper (e.g. network architecture comparison, component ablations, ...), we use a system to automatically generate and train the configs:

python run_experimenty.py --exp <ID>

More information about the available experiments and their assigned IDs, can be found in experiments.py. The generated configs will be stored in configs/generated/.

Testing & Predictions

The provided DAFormer checkpoint trained on GTA->Cityscapes (already downloaded by tools/download_checkpoints.sh) can be tested on the Cityscapes validation set using:

sh test.sh work_dirs/211108_1622_gta2cs_daformer_s0_7f24c

The predictions are saved for inspection to work_dirs/211108_1622_gta2cs_daformer_s0_7f24c/preds and the mIoU of the model is printed to the console. The provided checkpoint should achieve 68.85 mIoU. Refer to the end of work_dirs/211108_1622_gta2cs_daformer_s0_7f24c/20211108_164105.log for more information such as the class-wise IoU.

Similarly, also other models can be tested after the training has finished:

sh test.sh path/to/checkpoint_directory

Framework Structure

This project is based on mmsegmentation version 0.16.0. For more information about the framework structure and the config system, please refer to the mmsegmentation documentation and the mmcv documentation.

The most relevant files for DAFormer are:

Acknowledgements

This project is based on the following open-source projects. We thank their authors for making the source code publically available.

Owner
Lukas Hoyer
Doctoral student at ETH Zurich
Lukas Hoyer
PyTorch implementations of neural network models for keyword spotting

Honk: CNNs for Keyword Spotting Honk is a PyTorch reimplementation of Google's TensorFlow convolutional neural networks for keyword spotting, which ac

Castorini 475 Dec 15, 2022
Python port of R's Comprehensive Dynamic Time Warp algorithm package

Welcome to the dtw-python package Comprehensive implementation of Dynamic Time Warping algorithms. DTW is a family of algorithms which compute the loc

Dynamic Time Warping algorithms 154 Dec 26, 2022
D2LV: A Data-Driven and Local-Verification Approach for Image Copy Detection

Facebook AI Image Similarity Challenge: Matching Track —— Team: imgFp This is the source code of our 3rd place solution to matching track of Image Sim

16 Dec 25, 2022
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with ONNX, TensorRT, ncnn, and OpenVINO supported.

Introduction YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and ind

7.7k Jan 03, 2023
ADOP: Approximate Differentiable One-Pixel Point Rendering

ADOP: Approximate Differentiable One-Pixel Point Rendering Abstract: We present a novel point-based, differentiable neural rendering pipeline for scen

Darius Rückert 1.9k Jan 06, 2023
OpenCV, MediaPipe Pose Estimation, Affine Transform for Icon Overlay

Yoga Pose Identification and Icon Matching Project Goal Detect yoga poses performed by a user and overlay a corresponding icon image. Running the main

Anna Garverick 1 Dec 03, 2021
Deep learning models for classification of 15 common weeds in the southern U.S. cotton production systems.

CottonWeeds Deep learning models for classification of 15 common weeds in the southern U.S. cotton production systems. requirements pytorch torchsumma

Dong Chen 8 Jun 07, 2022
A tensorflow model that predicts if the image is of a cat or of a dog.

Quick intro Hello and thank you for your interest in my project! This is the backend part of a two-repo application. The other part can be found here

Tudor Matei 0 Mar 08, 2022
(IEEE TIP 2021) Regularized Densely-connected Pyramid Network for Salient Instance Segmentation

RDPNet IEEE TIP 2021: Regularized Densely-connected Pyramid Network for Salient Instance Segmentation PyTorch training and testing code are available.

Yu-Huan Wu 41 Oct 21, 2022
Scripts used to make and evaluate OpenAlex's concept tagging model

openalex-concept-tagging This repository contains all of the code for getting the concept tagger up and running. To learn more about where this model

OurResearch 18 Dec 09, 2022
Simple reference implementation of GraphSAGE.

Reference PyTorch GraphSAGE Implementation Author: William L. Hamilton Basic reference PyTorch implementation of GraphSAGE. This reference implementat

William L Hamilton 861 Jan 06, 2023
MODNet: Trimap-Free Portrait Matting in Real Time

MODNet is a model for real-time portrait matting with only RGB image input.

Zhanghan Ke 2.8k Dec 30, 2022
This repo is about implementing different approaches of pose estimation and also is a sub-task of the smart hospital bed project :smile:

Pose-Estimation This repo is a sub-task of the smart hospital bed project which is about implementing the task of pose estimation 😄 Many thanks to th

Max 11 Oct 17, 2022
Supporting code for "Autoregressive neural-network wavefunctions for ab initio quantum chemistry".

naqs-for-quantum-chemistry This repository contains the codebase developed for the paper Autoregressive neural-network wavefunctions for ab initio qua

Tom Barrett 24 Dec 23, 2022
[AI6101] Introduction to AI & AI Ethics is a core course of MSAI, SCSE, NTU, Singapore

[AI6101] Introduction to AI & AI Ethics is a core course of MSAI, SCSE, NTU, Singapore. The repository corresponds to the AI6101 of Semester 1, AY2021-2022, starting from 08/2021. The instructors of

AccSrd 1 Sep 22, 2022
Exploration & Research into cross-domain MEV. Initial focus on ETH/POLYGON.

xMEV, an apt exploration This is a small exploration on the xMEV opportunities between Polygon and Ethereum. It's a data analysis exercise on a few pa

odyslam.eth 7 Oct 18, 2022
Motion and Shape Capture from Sparse Markers

MoSh++ This repository contains the official chumpy implementation of mocap body solver used for AMASS: AMASS: Archive of Motion Capture as Surface Sh

Nima Ghorbani 135 Dec 23, 2022
Attack on Confidence Estimation algorithm from the paper "Disrupting Deep Uncertainty Estimation Without Harming Accuracy"

Attack on Confidence Estimation (ACE) This repository is the official implementation of "Disrupting Deep Uncertainty Estimation Without Harming Accura

3 Mar 30, 2022
A small demonstration of using WebDataset with ImageNet and PyTorch Lightning

A small demonstration of using WebDataset with ImageNet and PyTorch Lightning

Tom 50 Dec 16, 2022
Implementation of the ICCV'21 paper Temporally-Coherent Surface Reconstruction via Metric-Consistent Atlases

Temporally-Coherent Surface Reconstruction via Metric-Consistent Atlases [Papers 1, 2][Project page] [Video] The implementation of the papers Temporal

56 Nov 21, 2022