(ICCV 2021 Oral) Re-distributing Biased Pseudo Labels for Semi-supervised Semantic Segmentation: A Baseline Investigation.

Last update: Jan 01, 2023

Related tags

Deep Learning DARS

Overview

DARS

Code release for the paper "Re-distributing Biased Pseudo Labels for Semi-supervised Semantic Segmentation: A Baseline Investigation", ICCV 2021 (oral).

Authors: Ruifei He*, Jihan Yang*, Xiaojuan Qi (*equal contribution)

arxiv

Usage

Install

Clone this repo:

git clone https://https://github.com/CVMI-Lab/DARS.git
cd DARS

Create a conda virtual environment and activate it:

conda create -n DARS python=3.7 -y
conda activate DARS

Install CUDA==10.1 with cudnn7 following the official installation instructions
Install PyTorch==1.7.1 and torchvision==0.8.2 with CUDA==10.1:

conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.1 -c pytorch

Install Apex:

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Install other requirements:

pip install opencv-python==4.4.0.46 tensorboardX pyyaml

Initialization weights

For PSPNet50, we follow PyTorch Semantic Segmentation and use Imagenet pre-trained weights, which could be found here.

For Deeplabv2, we follow the exact same settings in semisup-semseg, AdvSemiSeg and use Imagenet pre-trained weights.

mkdir initmodel  
# Put the initialization weights under this folder. 
# You can check model/pspnet.py or model/deeplabv2.py.

Data preparation

mkdir dataset  # put the datasets under this folder. You can verify the data path in config files.

Cityscapes

Download the dataset from the Cityscapes dataset server(Link). Download the files named 'gtFine_trainvaltest.zip', 'leftImg8bit_trainvaltest.zip' and extract in dataset/cityscapes/.

For data split, we randomly split the 2975 training samples into 1/8, 7/8 and 1/4 and 3/4. The generated lists are provided in the data_split folder.

Note that since we define an epoch as going through all the samples in the unlabeled data and a batch consists of half labeled and half unlabeled, we repeat the shorter list (labeled list) to the length of the corresponding unlabeled list for convenience.

You can generate random split lists by yourself or use the ones that we provided. You should put them under dataset/cityscapes/list/.

PASCAL VOC 2012

The PASCAL VOC 2012 dataset we used is the commonly used 10582 training set version. If you are unfamiliar with it, please refer to this blog.

For data split, we use the official 1464 training images as labeled data and the 9k augmented set as unlabeled data. We also repeat the labeled list to match that of the unlabeled list.

You should also put the lists under dataset/voc2012/list/.

Training

The config files are located within config folder.

For PSPNet50, crop size 713 requires at least 4*16G GPUs or 8*10G GPUs, and crop size 361 requires at least 1*16G GPU or 2*10G GPUs.

For Deeplabv2, crop size 361 requires at least 1*16G GPU or 2*10G GPUs.

Please adjust the GPU settings in the config files ('train_gpu' and 'test_gpu') according to your machine setup.

The generation of pseudo labels would require 200G usage of disk space, reducing to only 600M after they are generated.

All training scripts for pspnet50 and deeplabv2 are in the tool/scripts folder. For example, to train PSPNet50 for the Cityscapes 1/8 split setting with crop size 713x713, use the following command:

sh tool/scripts/train_psp50_cityscapes_split8_crop713.sh

Acknowledgement

Our code is largely based on PyTorch Semantic Segmentation, and we thank the authors for their wonderful implementation.

We also thank the open-source code from semisup-semseg, AdvSemiSeg, DST-CBC.

Citation

If you find this project useful in your research, please consider cite:

@inproceedings{he2021re,
  title={Re-distributing Biased Pseudo Labels for Semi-supervised Semantic Segmentation: A Baseline Investigation},
  author={He, Ruifei and Yang, Jihan and Qi, Xiaojuan},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={6930--6940},
  year={2021}
}

(ICCV 2021 Oral) Re-distributing Biased Pseudo Labels for Semi-supervised Semantic Segmentation: A Baseline Investigation.

Related tags

Overview

DARS

Usage

Install

Initialization weights

Data preparation

Cityscapes

PASCAL VOC 2012

Training

Acknowledgement

Citation

Owner

CVMI Lab

Implementation for Paper "Inverting Generative Adversarial Renderer for Face Reconstruction"

This reporistory contains the test-dev data of the paper "xGQA: Cross-lingual Visual Question Answering".

"NAS-Bench-301 and the Case for Surrogate Benchmarks for Neural Architecture Search".

Running AlphaFold2 (from ColabFold) in Azure Machine Learning

Rotation Robust Descriptors

Serve TensorFlow ML models with TF-Serving and then create a Streamlit UI to use them

An educational resource to help anyone learn deep reinforcement learning.

Official PyTorch code of DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization (ICCV 2021 Oral).

A high performance implementation of HDBSCAN clustering.

[AAAI 2022] Sparse Structure Learning via Graph Neural Networks for Inductive Document Classification

The final project of "Applying AI to 3D Medical Imaging Data" from "AI for Healthcare" nanodegree - Udacity.

Transparent Transformer Segmentation

Code of TIP2021 Paper《SFace: Sigmoid-Constrained Hypersphere Loss for Robust Face Recognition》. We provide both MxNet and Pytorch versions.

MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera

A simple, high level, easy-to-use open source Computer Vision library for Python.

GalaXC: Graph Neural Networks with Labelwise Attention for Extreme Classification

Lighting the Darkness in the Deep Learning Era: A Survey, An Online Platform, A New Dataset

[CVPR 2021] MiVOS - Mask Propagation module. Reproduced STM (and better) with training code :star2:. Semi-supervised video object segmentation evaluation.

Codes to calculate solar-sensor zenith and azimuth angles directly from hyperspectral images collected by UAV. Works only for UAVs that have high resolution GNSS/IMU unit.

VLGrammar: Grounded Grammar Induction of Vision and Language