ICNet and PSPNet-50 in Tensorflow for real-time semantic segmentation

Last update: Nov 21, 2022

Overview

Real-Time Semantic Segmentation in TensorFlow

Perform pixel-wise semantic segmentation on high-resolution images in real-time with Image Cascade Network (ICNet), the highly optimized version of the state-of-the-art Pyramid Scene Parsing Network (PSPNet). This project implements ICNet and PSPNet50 in Tensorflow with training support for Cityscapes.

Download pre-trained ICNet and PSPNet50 models here

Deploy ICNet and preform inference at over 30fps on NVIDIA Titan Xp.

This implementation is based off of the original ICNet paper proposed by Hengshuang Zhao titled ICNet for Real-Time Semantic Segmentation on High-Resolution Images. Some ideas were also taken from their previous PSPNet paper, Pyramid Scene Parsing Network. The network compression implemented is based on the paper Pruning Filters for Efficient ConvNets.

Release information

October 14, 2018

An ICNet model trained in August, 2018 has been released as a pre-trained model in the Model Zoo. All the models were trained without coarse labels and are evaluated on the validation set.

September 22, 2018

The baseline PSPNet50 pre-trained model files have been released publically in the Model Zoo. The accuracy of the model surpases that referenced in the ICNet paper.

August 12, 2018

Initial release. Project includes scripts for training ICNet, evaluating ICNet and compressing ICNet from ResNet50 weights. Also includes scripts for training PSPNet and evaluating PSPNet as a baseline.

Documentation

Model Depot Inference Tutorials

Overview

ICNet model in Tensorboard.

Training ICNet from Classification Weights

This project has implemented the ICNet training process, allowing you to train your own model directly from ResNet50 weights as is done in the original work. Other available implementations simply convert the Caffe model to Tensorflow, only allowing for fine-tuning from weights trained on Cityscapes.

By training ICNet on weights initialized from ImageNet, you have more flexibility in the transfer learning process. Read more about setting up this process can be found here. For training ICNet, follow the guide here.

ICNet Network Compression

In order to achieve real-time speeds, ICNet uses a form of network compression called filter pruning. This drastically reduces the complexity of the model by removing filters from convolutional layers in the network. This project has also implemented this ICNet compression process directly in Tensorflow.

The compression is working, however which "compression scheme" to use is still somewhat ambiguous when reading the original ICNet paper. This is still a work in progress.

PSPNet Baseline Implementation

In order to also reproduce the baselines used in the original ICNet paper, you will also find implementations and pre-trained models for PSPNet50. Since ICNet can be thought of as a modified PSPNet, it can be useful for comparison purposes.

Informtion on training or using the baseline PSPNet50 model can be found here.

Maintainers

Oles Andrienko, github: oandrienko

If you found the project, documentation and the provided pretrained models useful in your work, consider citing it with

@misc{fastsemseg2018,
  author={Andrienko, Oles},
  title={Fast Semantic Segmentation},
  howpublished={\url{https://github.com/oandrienko/fast-semantic-segmentation}},
  year={2018}
}

Related Work

This project and some of the documentation was based on the Tensorflow Object Detection API. It was the initial inspiration for this project. The third_party directory of this project contains files from OpenAI's Gradient Checkpointing project by Tim Salimans and Yaroslav Bulatov. The helper modules found in third_party/model_deploy.py are from the Tensorflow Slim project. Finally, another open source ICNet implementation which converts the original Caffe network weights to Tensorflow was used as a reference. Find all these projects below:

Thanks

This project could not have happened without the advice (and GPU access) given by Professor Steven Waslander and Ali Harakeh from the Waterloo Autonomous Vehicles Lab (now the Toronto Robotics and Artificial Intelligence Lab).

ICNet and PSPNet-50 in Tensorflow for real-time semantic segmentation

Related tags

Overview

Real-Time Semantic Segmentation in TensorFlow

Release information

October 14, 2018

September 22, 2018

August 12, 2018

Documentation

Model Depot Inference Tutorials

Overview

Training ICNet from Classification Weights

ICNet Network Compression

PSPNet Baseline Implementation

Maintainers

Related Work

Thanks

Owner

Oles Andrienko

DenseNet Implementation in Keras with ImageNet Pretrained Models

DLFlow is a deep learning framework.

An open source Python package for plasma science that is under development

A Simple and Versatile Framework for Object Detection and Instance Recognition

Simple node deletion tool for onnx.

Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs"

Kaggle-titanic - A tutorial for Kaggle's Titanic: Machine Learning from Disaster competition. Demonstrates basic data munging, analysis, and visualization techniques. Shows examples of supervised machine learning techniques.

Official code for 'Robust Siamese Object Tracking for Unmanned Aerial Manipulator' and offical introduction to UAMT100 benchmark

[ICLR 2021] Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization

PyTorch experiments with the Zalando fashion-mnist dataset

Self-Supervised Pillar Motion Learning for Autonomous Driving (CVPR 2021)

[NeurIPS 2021] Garment4D: Garment Reconstruction from Point Cloud Sequences

Code and models for "Pano3D: A Holistic Benchmark and a Solid Baseline for 360 Depth Estimation", OmniCV Workshop @ CVPR21.

Official implementation of Deep Convolutional Dictionary Learning for Image Denoising.

TraSw for FairMOT - A Single-Target Attack example (Attack ID: 19; Screener ID: 24):

Baseline of DCASE 2020 task 4

Cooperative multi-agent reinforcement learning for high-dimensional nonequilibrium control

Official implementation of Rethinking Graph Neural Architecture Search from Message-passing (CVPR2021)

The Power of Scale for Parameter-Efficient Prompt Tuning

Source Code and data for my paper titled Linguistic Knowledge in Data Augmentation for Natural Language Processing: An Example on Chinese Question Matching