Pyramid Scene Parsing Network, CVPR2017.

Last update: Jan 05, 2023

Overview

Pyramid Scene Parsing Network

by Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia. Details are on the project page.

Introduction

This repository is for 'Pyramid Scene Parsing Network', which won 1st place in the ImageNet Scene Parsing Challenge 2016. The code is modified from the Caffe versions of DeepLab v2 and yjxiong for evaluation. We merge the batch normalization layer named 'bn_layer' from the former into the latter, while keeping the original 'batch_norm_layer' in the latter unchanged for compatibility. The difference is that 'bn_layer' contains four parameters ('slope, bias, mean, variance'), while 'batch_norm_layer' contains two ('mean, variance'). Some evaluation code is borrowed from MIT Scene Parsing.
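
The difference between the two layers can be sketched as follows. This is only an illustration of the usual batch normalization arithmetic at inference time, not the repository's Caffe code: the function names and the `EPS` constant are assumptions made for this example, and only the parameter names 'slope', 'bias', 'mean' and 'variance' come from the text above.

```python
import math

EPS = 1e-5  # assumed numerical-stability constant; not specified in this README


def normalize(x, mean, variance, eps=EPS):
    """Two-parameter form: standardize with the stored statistics only."""
    return (x - mean) / math.sqrt(variance + eps)


def normalize_scale_shift(x, slope, bias, mean, variance, eps=EPS):
    """Four-parameter form: standardize, then apply a learned scale and shift."""
    return slope * normalize(x, mean, variance, eps) + bias


if __name__ == "__main__":
    x, mean, variance = 3.0, 1.0, 4.0
    print(normalize(x, mean, variance))
    print(normalize_scale_shift(x, slope=2.0, bias=0.5, mean=mean, variance=variance))
```

With 'slope' set to 1 and 'bias' set to 0, the four-parameter form reduces to the two-parameter one, which is presumably why both layers can coexist in one code base.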

PyTorch Version

A highly optimized PyTorch codebase for semantic segmentation is available in the repo semseg, including full training and testing code for PSPNet and PSANet.

Installation

For installation, please follow the instructions for Caffe and DeepLab v2. To enable cuDNN for GPU acceleration, cuDNN v4 is required. If you encounter an error related to 'matio', please download and install matio as required by 'DeepLab v2'.

The code has been tested successfully on Ubuntu 14.04 and 12.04 with CUDA 7.0.

Usage

Clone the repository:

```
git clone https://github.com/hszhao/PSPNet.git
```

Build Caffe and matcaffe:

```
cd $PSPNET_ROOT
cp Makefile.config.example Makefile.config
vim Makefile.config
make -j8 && make matcaffe
```

Evaluation:
- Evaluation code is in folder 'evaluation'.
- Download trained models and put them in folder 'evaluation/model':
  - pspnet50_ADE20K.caffemodel: GoogleDrive
  - pspnet101_VOC2012.caffemodel: GoogleDrive
  - pspnet101_cityscapes.caffemodel: GoogleDrive
- Modify the related paths in 'eval_all.m':
  - Mainly the variables 'data_root' and 'eval_list'. Your image list for evaluation should be similar to the one in folder 'evaluation/samplelist' if you use this evaluation code structure.
  - Matlab 'parfor' evaluation is used, and the default GPU IDs are [0:3]. Modify the variable 'gpu_id_array' if needed. We assume that the number of images is divisible by the number of GPUs; if not, you can pad your image list, or switch to single-GPU evaluation by setting 'gpu_id_array' to a single ID and changing 'parfor' to a 'for' loop.
```
cd evaluation
vim eval_all.m
```
- Run the evaluation scripts:
```
./run.sh
```
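
The padding step mentioned above (making the number of images divisible by the number of GPUs) can be sketched in Python. The helpers `pad_image_list` and `split_across_gpus` are hypothetical and not part of this repository; repeating the last entry and splitting into contiguous chunks are assumptions, since the README does not say how 'eval_all.m' distributes the images.

```python
def pad_image_list(image_list, num_gpus):
    """Repeat the last entry until the list length is a multiple of num_gpus."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if not image_list:
        return []
    remainder = len(image_list) % num_gpus
    if remainder == 0:
        return list(image_list)
    return list(image_list) + [image_list[-1]] * (num_gpus - remainder)


def split_across_gpus(image_list, gpu_id_array):
    """Assign each GPU ID one contiguous, equally sized chunk of the padded list."""
    padded = pad_image_list(image_list, len(gpu_id_array))
    chunk = len(padded) // len(gpu_id_array)
    return {
        gpu: padded[i * chunk:(i + 1) * chunk]
        for i, gpu in enumerate(gpu_id_array)
    }


if __name__ == "__main__":
    images = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]
    for gpu, chunk in split_across_gpus(images, [0, 1, 2, 3]).items():
        print(gpu, chunk)
```

Padded entries are evaluated more than once, so they may slightly shift averaged scores unless the duplicates are dropped afterwards.
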
Results:

Prediction results are written to folder 'evaluation/mc_result', and the expected scores are:

(single-scale testing is denoted 'ss' and multi-scale testing 'ms')
- PSPNet50 on ADE20K valset (mIoU/pAcc): 41.68/80.04 (ss) and 42.78/80.76 (ms)
- PSPNet101 on VOC2012 testset (mIoU): 85.41 (ms)
- PSPNet101 on cityscapes valset (mIoU/pAcc): 79.70/96.38 (ss) and 80.91/96.59 (ms)
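
For readers unfamiliar with the two metrics, the sketch below computes mean intersection-over-union (mIoU) and pixel accuracy (pAcc) from a confusion matrix using their common definitions. It is not the repository's MATLAB evaluation code: the function names are hypothetical, and details such as ignored labels or how absent classes are treated are assumptions. The scores listed above are percentages, whereas this sketch returns fractions.

```python
def pixel_accuracy(conf):
    """Fraction of pixels whose predicted class equals the ground-truth class.

    conf[i][j] counts pixels with ground-truth class i predicted as class j.
    """
    total = sum(sum(row) for row in conf)
    correct = sum(conf[i][i] for i in range(len(conf)))
    return correct / total if total else 0.0


def mean_iou(conf):
    """Mean over classes of intersection / union.

    Classes that appear in neither ground truth nor prediction are skipped
    (an assumption; other conventions exist).
    """
    n = len(conf)
    ious = []
    for c in range(n):
        intersection = conf[c][c]
        ground_truth = sum(conf[c])
        predicted = sum(conf[r][c] for r in range(n))
        union = ground_truth + predicted - intersection
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    conf = [[3, 1],
            [1, 5]]
    print(pixel_accuracy(conf), mean_iou(conf))
```
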
Demo video:

Videos processed by PSPNet101 on the Cityscapes dataset:

Merge with colormap on side: Video1

Alpha blending with value as 0.5: Video2
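
Alpha blending with a value of 0.5 can be sketched per pixel as below. The helper `alpha_blend` is hypothetical, and which layer the alpha weight applies to (here, the colorized prediction) is an assumption; the README only states the value 0.5.

```python
def alpha_blend(image_pixel, color_pixel, alpha=0.5):
    """Blend two RGB pixels as alpha * color + (1 - alpha) * image, clamped to 0..255."""
    return tuple(
        min(255, max(0, int(round(alpha * c + (1.0 - alpha) * p))))
        for p, c in zip(image_pixel, color_pixel)
    )


if __name__ == "__main__":
    frame_pixel = (100, 100, 100)  # pixel from the input video frame
    label_color = (200, 0, 50)     # colormap entry of the predicted class
    print(alpha_blend(frame_pixel, label_color))
```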

Citation

If PSPNet is useful for your research, please consider citing:

@inproceedings{zhao2017pspnet,
  title={Pyramid Scene Parsing Network},
  author={Zhao, Hengshuang and Shi, Jianping and Qi, Xiaojuan and Wang, Xiaogang and Jia, Jiaya},
  booktitle={CVPR},
  year={2017}
}

Questions

Please contact '[email protected]'

