Official Implementation of HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation

Last update: Dec 28, 2022

by Lukas Hoyer, Dengxin Dai, and Luc Van Gool

Overview

Unsupervised domain adaptation (UDA) aims to adapt a model trained on synthetic data to real-world data without requiring expensive annotations of real-world images. As UDA methods for semantic segmentation are usually GPU memory intensive, most previous methods operate only on downscaled images. We question this design as low-resolution predictions often fail to preserve fine details. The alternative of training with random crops of high-resolution images alleviates this problem but falls short in capturing long-range, domain-robust context information.

Therefore, we propose HRDA, a multi-resolution training approach for UDA, that combines the strengths of small high-resolution crops to preserve fine segmentation details and large low-resolution crops to capture long-range context dependencies with a learned scale attention, while maintaining a manageable GPU memory footprint.

HRDA enables adapting small objects and preserving fine segmentation details. It significantly improves the state-of-the-art performance by 5.5 mIoU for GTA→Cityscapes and by 4.9 mIoU for Synthia→Cityscapes, resulting in an unprecedented performance of 73.8 and 65.8 mIoU, respectively.

The more detailed domain-adaptive semantic segmentation of HRDA, compared to the previous state-of-the-art UDA method DAFormer, can also be observed in example predictions from the Cityscapes validation set.

For more information on HRDA, please check our [Paper].

If you find HRDA useful in your research, please consider citing:

@Article{hoyer2022hrda,
  title={{HRDA}: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation},
  author={Hoyer, Lukas and Dai, Dengxin and Van Gool, Luc},
  journal={arXiv preprint arXiv:2204.13132},
  year={2022}
}

Setup Environment

For this project, we used Python 3.8.5. We recommend setting up a new virtual environment:

python -m venv ~/venv/hrda
source ~/venv/hrda/bin/activate

In that environment, the requirements can be installed with:

pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
pip install mmcv-full==1.3.7  # requires the other packages to be installed first

Further, please download the MiT weights from SegFormer using the following script. If problems occur with the automatic download, please follow the instructions for a manual download within the script.

sh tools/download_checkpoints.sh

Setup Datasets

Cityscapes: Please download leftImg8bit_trainvaltest.zip and gt_trainvaltest.zip from the official Cityscapes download page and extract them to data/cityscapes.

GTA: Please download all image and label packages from the official GTA dataset page and extract them to data/gta.

Synthia: Please download SYNTHIA-RAND-CITYSCAPES from the official SYNTHIA dataset page and extract it to data/synthia.

The final folder structure should look like this:

DAFormer
├── ...
├── data
│   ├── cityscapes
│   │   ├── leftImg8bit
│   │   │   ├── train
│   │   │   ├── val
│   │   ├── gtFine
│   │   │   ├── train
│   │   │   ├── val
│   ├── gta
│   │   ├── images
│   │   ├── labels
│   ├── synthia
│   │   ├── RGB
│   │   ├── GT
│   │   │   ├── LABELS
├── ...
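
Before running the preprocessing, it can help to verify that the folders from the structure above are in place. The following is a minimal sketch that is not part of the repository; the names EXPECTED_DIRS and missing_dataset_dirs are hypothetical and only illustrate the check.

```python
from pathlib import Path

# Sub-folders listed in the structure above, relative to the data directory.
EXPECTED_DIRS = [
    "cityscapes/leftImg8bit/train",
    "cityscapes/leftImg8bit/val",
    "cityscapes/gtFine/train",
    "cityscapes/gtFine/val",
    "gta/images",
    "gta/labels",
    "synthia/RGB",
    "synthia/GT/LABELS",
]


def missing_dataset_dirs(data_root="data"):
    """Return the expected sub-folders that do not exist under data_root."""
    root = Path(data_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


# Run from the repository root; an empty list means every folder was found.
print(missing_dataset_dirs("data"))
```

Any path printed by the script still needs to be downloaded or extracted.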

Data Preprocessing: Finally, please run the following scripts to convert the label IDs to the train IDs and to generate the class index for RCS:

python tools/convert_datasets/gta.py data/gta --nproc 8
python tools/convert_datasets/cityscapes.py data/cityscapes --nproc 8
python tools/convert_datasets/synthia.py data/synthia/ --nproc 8
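
To illustrate what the conversion to train IDs means, the sketch below applies the standard Cityscapes label-ID-to-train-ID mapping to a tiny label map and counts the pixels per class. It is an illustration only and not the code of the scripts above: the function names are hypothetical, Synthia uses a different ID scheme that is not shown, and the per-class pixel count is merely the kind of statistic a class index could be built from.

```python
from collections import Counter

# Standard Cityscapes mapping from label IDs to the 19 train IDs.
CITYSCAPES_ID_TO_TRAINID = {
    7: 0, 8: 1, 11: 2, 12: 3, 13: 4, 17: 5, 19: 6, 20: 7, 21: 8, 22: 9,
    23: 10, 24: 11, 25: 12, 26: 13, 27: 14, 28: 15, 31: 16, 32: 17, 33: 18,
}
IGNORE_INDEX = 255  # label IDs without a train ID are ignored


def to_train_ids(label_rows):
    """Map a 2D list of label IDs to train IDs (unmapped IDs become 255)."""
    return [
        [CITYSCAPES_ID_TO_TRAINID.get(v, IGNORE_INDEX) for v in row]
        for row in label_rows
    ]


def pixels_per_class(train_id_rows):
    """Count the pixels of each train ID, skipping ignored pixels."""
    counts = Counter(v for row in train_id_rows for v in row)
    counts.pop(IGNORE_INDEX, None)
    return dict(counts)


# Toy 2x2 label map: road (7), unlabeled (0), car (26), bicycle (33).
train_ids = to_train_ids([[7, 0], [26, 33]])
print(train_ids)
print(pixels_per_class(train_ids))
```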

Testing & Predictions

The provided HRDA checkpoint trained on GTA->Cityscapes (already downloaded by tools/download_checkpoints.sh) can be tested on the Cityscapes validation set using:

sh test.sh work_dirs/gtaHR2csHR_hrda_246ef

The predictions are saved for inspection to work_dirs/gtaHR2csHR_hrda_246ef/preds and the mIoU of the model is printed to the console. The provided checkpoint should achieve 73.79 mIoU. Refer to the end of work_dirs/gtaHR2csHR_hrda_246ef/20220215_002056.log for more information such as the class-wise IoU.

If you want to visualize the LR predictions, HR predictions, or scale attentions of HRDA on the validation set, please refer to test.sh for further instructions.

Training

For convenience, we provide an annotated config file of the final HRDA. A training job can be launched using:

python run_experiments.py --config configs/hrda/gtaHR2csHR_hrda.py

The logs and checkpoints are stored in work_dirs/.

For the other experiments in our paper, we use a script to automatically generate and train the configs:

python run_experiments.py --exp <ID>

More information about the available experiments and their assigned IDs can be found in experiments.py. The generated configs will be stored in configs/generated/.

When training a model on Synthia->Cityscapes, please note that the evaluation script calculates the mIoU over all 19 Cityscapes classes. However, Synthia only contains labels for 16 of these classes. Therefore, it is common practice in UDA to report the mIoU for Synthia->Cityscapes on these 16 classes only. As the IoU of the 3 missing classes is 0, the sum of the class-wise IoUs is the same in both cases, and you can convert the result with mIoU16 = mIoU19 * 19 / 16.
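
The conversion can be checked with a few lines of Python. This is a small sketch with made-up IoU values; the function name miou16_from_miou19 is hypothetical and not part of the repository.

```python
def miou16_from_miou19(miou19):
    """Convert a 19-class mIoU to the 16-class mIoU.

    Assumes the IoU of the 3 classes missing in Synthia is exactly 0,
    so the sum of the class-wise IoUs is unchanged.
    """
    return miou19 * 19 / 16


# Made-up example: 16 classes with IoU 0.5 and 3 missing classes with IoU 0.
example_ious = [0.5] * 16 + [0.0] * 3
example_miou19 = sum(example_ious) / 19
print(example_miou19, miou16_from_miou19(example_miou19))
```

The same function works for values given in percent, as the conversion is a pure rescaling.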

Framework Structure

This project is based on mmsegmentation version 0.16.0. For more information about the framework structure and the config system, please refer to the mmsegmentation documentation and the mmcv documentation.

The most relevant files for HRDA are:

configs/hrda/gtaHR2csHR_hrda.py: Annotated config file for the final HRDA.
mmseg/models/segmentors/hrda_encoder_decoder.py: Implementation of the HRDA multi-resolution encoding with context and detail crop.
mmseg/models/decode_heads/hrda_head.py: Implementation of the HRDA decoding with multi-resolution fusion and scale attention.
mmseg/models/uda/dacs.py: Implementation of the DAFormer self-training.
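
As a rough intuition for the multi-resolution fusion with scale attention, the following toy sketch blends an upsampled low-resolution context prediction with a high-resolution detail prediction for a single class-score map. It is a simplified illustration under stated assumptions and not the code in hrda_head.py: the function name, the nested-list inputs, and the crop_box argument are hypothetical, and the attention is assumed to be given at output resolution already.

```python
def fuse_scores(lr_up, hr_detail, attention, crop_box):
    """Attention-weighted fusion for one class-score map.

    lr_up:      H x W context scores, already upsampled to the output size
    hr_detail:  h x w detail scores covering crop_box
    attention:  H x W weights in [0, 1] (1 = trust the detail prediction)
    crop_box:   (top, left, bottom, right) of the detail crop in output coords
    """
    top, left, bottom, right = crop_box
    # Outside the detail crop, only the context prediction is available.
    fused = [row[:] for row in lr_up]
    for i in range(top, bottom):
        for j in range(left, right):
            a = attention[i][j]
            hr = hr_detail[i - top][j - left]
            fused[i][j] = (1 - a) * lr_up[i][j] + a * hr
    return fused


# Toy 2x2 output with a 1x1 detail crop in the top-right corner.
lr_up = [[0.2, 0.2], [0.2, 0.2]]
hr_detail = [[1.0]]
attention = [[0.5, 0.5], [0.5, 0.5]]
print(fuse_scores(lr_up, hr_detail, attention, (0, 1, 1, 2)))
```

Pixels inside the crop are a weighted mix of both predictions, while pixels outside keep the context prediction unchanged.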

Acknowledgements

HRDA is based on the following open-source projects. We thank their authors for making the source code publicly available.
