[CVPR22] Official codebase of Semantic Segmentation by Early Region Proxy.

Overview

RegionProxy

Figure 2. Performance vs. GFLOPs on ADE20K val split.

Semantic Segmentation by Early Region Proxy

Yifan Zhang, Bo Pang, Cewu Lu

CVPR 2022 (Poster) [arXiv]

Installation

Note: recommend using the exact version of the packages to avoid running issues.

  1. Install PyTorch 1.7.1 and torchvision 0.8.2 following the official guide.

  2. Install timm 0.4.12 and einops:

    pip install timm==0.4.12 einops
    
  3. This project depends on mmsegmentation 0.17 and mmcv 1.3.13, so you may follow its instructions to setup environment and prepare datasets.

Models

ADE20K

backbone Resolution FLOPs #params. mIoU mIoU (ms+flip) FPS download
ViT-Ti/16 512x512 3.9G 5.8M 42.1 43.1 38.9 [model]
ViT-S/16 512x512 15G 22M 47.6 48.5 32.1 [model]
R26+ViT-S/32 512x512 16G 36M 47.8 49.1 28.5 [model]
ViT-B/16 512x512 59G 87M 49.8 50.5 20.1 [model]
R50+ViT-L/32 640x640 82G 323M 51.0 51.7 12.7 [model]
ViT-L/16 640x640 326G 306M 52.9 53.4 6.6 [model]

Cityscapes

backbone Resolution FLOPs #params. mIoU mIoU (ms+flip) download
ViT-Ti/16 768x768 69G 6M 76.5 77.7 [model]
ViT-S/16 768x768 270G 23M 79.8 81.5 [model]
ViT-B/16 768x768 1064G 88M 81.0 82.2 [model]
ViT-L/16 768x768 - 307M 81.4 82.7 [model]

Evaluation

You may evaluate the model on single GPU by running:

python test.py \
	--config configs/regproxy_ade20k/regproxy-t16-sub4+implicit-mid-4+512x512+160k+adamw-poly+ade20k.py \
	--checkpoint /path/to/ckpt \
	--eval mIoU

To evaluate on multiple GPUs, run:

python -m torch.distributed.launch --nproc_per_node 8 test.py \
	--launcher pytorch \
	--config configs/regproxy_ade20k/regproxy-t16-sub4+implicit-mid-4+512x512+160k+adamw-poly+ade20k.py \
	--checkpoint /path/to/ckpt 
	--eval mIoU

You may add --aug-test to enable multi-scale + flip evaluation. The test.py script is mostly copy-pasted from mmsegmentation. Please refer to this link for more usage (e.g., visualization).

Training

The first step is to prepare the pre-trained weights. Following Segmenter, we use AugReg pre-trained weights on our tiny, small and large models, and we use DeiT pre-trained weights on our base models. Do following steps to prepare the pre-trained weights for model initialization:

  1. For DeiT weight, simply download from this link. For AugReg weights, first acquire the timm-style models:

    import timm
    m = timm.create_model('vit_tiny_patch16_384', pretrained=True)

    The full list of entries can be found here (vanilla ViTs) and here (hybrid models).

  2. Convert the timm models to mmsegmentation style using this script.

We train all models on 8 V100 GPUs. For example, to train RegProxy-Ti/16, run:

python -m torch.distributed.launch --nproc_per_node 8 train.py 
	--launcher pytorch \
	--config configs/regproxy_ade20k/regproxy-t16-sub4+implicit-mid-4+512x512+160k+adamw-poly+ade20k.py \
	--work-dir /path/to/workdir \
	--options model.pretrained=/path/to/pretrained/model

You may need to adjust data.samples_per_gpu if you plan to train on less GPUs. Please refer to this link for more training optioins.

Citation

@article{zhang2022semantic,
  title={Semantic Segmentation by Early Region Proxy},
  author={Zhang, Yifan and Pang, Bo and Lu, Cewu},
  journal={arXiv preprint arXiv:2203.14043},
  year={2022}
}
Owner
Yifan
Yifan
Single-stage Keypoint-based Category-level Object Pose Estimation from an RGB Image

CenterPose Overview This repository is the official implementation of the paper "Single-stage Keypoint-based Category-level Object Pose Estimation fro

NVIDIA Research Projects 188 Dec 27, 2022
A script depending on VASP output for calculating Fermi-Softness.

Fermi softness calculation for Vienna Ab initio Simulation Package (VASP) Update 1.1.0: Big update: Rewrote the code. Use Bader atomic division instea

qslin 11 Nov 08, 2022
PyTorch code for our paper "Attention in Attention Network for Image Super-Resolution"

Under construction... Attention in Attention Network for Image Super-Resolution (A2N) This repository is an PyTorch implementation of the paper "Atten

Haoyu Chen 71 Dec 30, 2022
A Conditional Point Diffusion-Refinement Paradigm for 3D Point Cloud Completion

A Conditional Point Diffusion-Refinement Paradigm for 3D Point Cloud Completion This repo intends to release code for our work: Zhaoyang Lyu*, Zhifeng

Zhaoyang Lyu 68 Jan 03, 2023
Time-series-deep-learning - Developing Deep learning LSTM, BiLSTM models, and NeuralProphet for multi-step time-series forecasting of stock price.

Stock Price Prediction Using Deep Learning Univariate Time Series Predicting stock price using historical data of a company using Neural networks for

Abdultawwab Safarji 7 Nov 27, 2022
DLL: Direct Lidar Localization

DLL: Direct Lidar Localization Summary This package presents DLL, a direct map-based localization technique using 3D LIDAR for its application to aeri

Service Robotics Lab 127 Dec 16, 2022
Trading and Backtesting environment for training reinforcement learning agent or simple rule base algo.

TradingGym TradingGym is a toolkit for training and backtesting the reinforcement learning algorithms. This was inspired by OpenAI Gym and imitated th

Yvictor 1.1k Jan 02, 2023
Graph Regularized Residual Subspace Clustering Network for hyperspectral image clustering

Graph Regularized Residual Subspace Clustering Network for hyperspectral image clustering

Yaoming Cai 5 Jul 18, 2022
Official PyTorch implementation of the paper Image-Based CLIP-Guided Essence Transfer.

TargetCLIP- official pytorch implementation of the paper Image-Based CLIP-Guided Essence Transfer This repository finds a global direction in StyleGAN

Hila Chefer 221 Dec 13, 2022
This code provides various models combining dilated convolutions with residual networks

Overview This code provides various models combining dilated convolutions with residual networks. Our models can achieve better performance with less

Fisher Yu 1.1k Dec 30, 2022
Optimizing synthesizer parameters using gradient approximation

Optimizing synthesizer parameters using gradient approximation NASH 2021 Hackathon! These are some experiments I conducted during NASH 2021, the Neura

Jordie Shier 10 Feb 10, 2022
Code repo for "Transformer on a Diet" paper

Transformer on a Diet Reference: C Wang, Z Ye, A Zhang, Z Zhang, A Smola. "Transformer on a Diet". arXiv preprint arXiv (2020). Installation pip insta

cgraywang 31 Sep 26, 2021
An implementation of RetinaNet in PyTorch.

RetinaNet An implementation of RetinaNet in PyTorch. Installation Training COCO 2017 Pascal VOC Custom Dataset Evaluation Todo Credits Installation In

Conner Vercellino 297 Jan 04, 2023
Generate text captions for images from their CLIP embeddings. Includes PyTorch model code and example training script.

clip-text-decoder Generate text captions for images from their CLIP embeddings. Includes PyTorch model code and example training script. Example Predi

Frank Odom 36 Dec 21, 2022
Code for the paper "Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness"

DU-VAE This is the pytorch implementation of the paper "Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness" Acknowledgement

Dazhong Shen 4 Oct 19, 2022
Data and code for the paper "Importance of Kernel Bandwidth in Quantum Machine Learning"

Reproducibility materials for "Importance of Kernel Bandwidth in Quantum Machine Learning" Repo structure: code contains Python scripts used to genera

Ruslan Shaydulin 3 Oct 23, 2022
OpenMMLab Pose Estimation Toolbox and Benchmark.

Introduction English | 简体中文 MMPose is an open-source toolbox for pose estimation based on PyTorch. It is a part of the OpenMMLab project. The master b

OpenMMLab 2.8k Dec 31, 2022
Metrics to evaluate quality and efficacy of synthetic datasets.

An Open Source Project from the Data to AI Lab, at MIT Metrics for Synthetic Data Generation Projects Website: https://sdv.dev Documentation: https://

The Synthetic Data Vault Project 129 Jan 03, 2023
The official codes of our CVPR2022 paper: A Differentiable Two-stage Alignment Scheme for Burst Image Reconstruction with Large Shift

TwoStageAlign The official codes of our CVPR2022 paper: A Differentiable Two-stage Alignment Scheme for Burst Image Reconstruction with Large Shift Pa

Shi Guo 32 Dec 15, 2022
Convert scikit-learn models to PyTorch modules

sk2torch sk2torch converts scikit-learn models into PyTorch modules that can be tuned with backpropagation and even compiled as TorchScript. Problems

Alex Nichol 101 Dec 16, 2022