A PyTorch-based real-time segmentation model for autonomous driving

Last update: Dec 22, 2022

Overview

CFPNet: Channel-Wise Feature Pyramid for Real-Time Semantic Segmentation

This project contains the PyTorch implementation of the proposed CFPNet; see the paper (arXiv:2103.12212).

Real-time semantic segmentation is playing an increasingly important role in computer vision, due to the growing demand for mobile devices and autonomous driving. It is therefore important to achieve a good trade-off among performance, model size and inference speed. In this paper, we propose a Channel-wise Feature Pyramid (CFP) module to balance those factors. Based on the CFP module, we build CFPNet for real-time semantic segmentation, which applies a series of dilated convolution channels to extract effective features. Experiments on the Cityscapes and CamVid datasets show that the proposed CFPNet achieves an effective combination of those factors. On the Cityscapes test set, CFPNet achieves 70.1% class-wise mIoU with only 0.55 million parameters and 2.5 MB memory. The inference speed can reach 30 FPS on a single RTX 2080Ti GPU (GPU usage 60%) with a 1024×2048-pixel image.
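As background for the dilated convolutions mentioned above: a k×k kernel with dilation rate r spans k + (k - 1)(r - 1) pixels per side, so larger rates widen the receptive field without adding parameters. The sketch below only evaluates that standard formula. The function name and the dilation rates in the loop are illustrative assumptions, not values taken from CFPNet.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Side length (in pixels) covered by one dilated convolution kernel."""
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be >= 1")
    return kernel_size + (kernel_size - 1) * (dilation - 1)


if __name__ == "__main__":
    # Illustrative dilation rates (assumed for this example, not from the paper).
    for rate in (1, 2, 4, 8):
        span = effective_kernel_size(3, rate)
        print(f"3x3 kernel, dilation {rate}: spans {span} pixels per side")
```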

Installation

Environment: Python 3.6; PyTorch 1.0; CUDA 9.0; cuDNN v7
Install the required packages:

pip install opencv-python pillow numpy matplotlib

Clone this repository:

git clone https://github.com/AngeLouCN/CFPNet

A GPU with 11 GB of memory is required.

Dataset

Download the two datasets, CamVid and Cityscapes, and put the files in the dataset folder with the following structure.

├── camvid
|    ├── train
|    ├── test
|    ├── val 
|    ├── trainannot
|    ├── testannot
|    ├── valannot
|    ├── camvid_trainval_list.txt
|    ├── camvid_train_list.txt
|    ├── camvid_test_list.txt
|    └── camvid_val_list.txt
├── cityscapes
|    ├── gtCoarse
|    ├── gtFine
|    ├── leftImg8bit
|    ├── cityscapes_trainval_list.txt
|    ├── cityscapes_train_list.txt
|    ├── cityscapes_test_list.txt
|    └── cityscapes_val_list.txt
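Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch using only the standard library; the entry names are copied from the tree above, while the helper name `missing_entries` and the root folder name `dataset` are assumptions to adjust to your checkout.

```python
from pathlib import Path

# Expected entries, copied from the directory tree above.
EXPECTED = {
    "camvid": [
        "train", "test", "val",
        "trainannot", "testannot", "valannot",
        "camvid_trainval_list.txt", "camvid_train_list.txt",
        "camvid_test_list.txt", "camvid_val_list.txt",
    ],
    "cityscapes": [
        "gtCoarse", "gtFine", "leftImg8bit",
        "cityscapes_trainval_list.txt", "cityscapes_train_list.txt",
        "cityscapes_test_list.txt", "cityscapes_val_list.txt",
    ],
}


def missing_entries(root, dataset):
    """Return the expected files/folders that are absent under root/dataset."""
    base = Path(root) / dataset
    return [name for name in EXPECTED[dataset] if not (base / name).exists()]


if __name__ == "__main__":
    # "dataset" as the root folder name is an assumption; change it if needed.
    for name in EXPECTED:
        missing = missing_entries("dataset", name)
        status = "OK" if not missing else "missing " + ", ".join(missing)
        print(f"{name}: {status}")
```

The check only looks at the top-level entries; it does not validate the contents of the list files or the images themselves.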

Training

Run python train.py -h to see the details of the optional arguments. In train.py, you can set the dataset, train type, number of epochs, batch size, etc.
Training on the Cityscapes train set:

python train.py --dataset cityscapes

Training on the CamVid train and val sets:

python train.py --dataset camvid --train_type trainval --max_epochs 1000 --lr 1e-3 --batch_size 16

During training, the mean IoU on the train and validation sets and the training loss are recorded every 50 epochs and plotted, so you can check whether training is proceeding normally.

Figures: Val mIoU vs. epochs; Train loss vs. epochs

Testing

After training, the checkpoint is saved in the checkpoint folder. You can use test.py to predict the results.

python test.py --dataset ${camvid, cityscapes} --checkpoint ${CHECKPOINT_FILE}

Evaluation

For datasets that do not provide labels for the test set (e.g. Cityscapes), you can use predict.py to save all the output images, then submit them to the official webpage for evaluation.

python test.py --dataset ${camvid, cityscapes} --checkpoint ${CHECKPOINT_FILE}

Inference Speed

Run eval_fps.py to test the model's inference speed, passing the input image size, such as 1024,2048.

python eval_fps.py 1024,2048
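For readers who want to see what such a measurement involves, here is a minimal timing sketch. It is not the code in eval_fps.py: the helper names, the warm-up and iteration counts, and the reading of 1024,2048 as height,width are all assumptions, and the workload is a stand-in for a real forward pass.

```python
import time


def parse_size(text):
    """Parse a 'height,width' string such as '1024,2048' into two ints."""
    height, width = (int(part) for part in text.split(","))
    return height, width


def measure_fps(run_once, iterations=100, warmup=10):
    """Average calls per second of run_once, after untimed warm-up calls."""
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    elapsed = time.perf_counter() - start
    return iterations / elapsed


if __name__ == "__main__":
    height, width = parse_size("1024,2048")

    def dummy_forward():
        # Stand-in workload; replace with a model forward pass
        # on an input of shape (height, width).
        sum(range(10_000))

    fps = measure_fps(dummy_forward)
    print(f"{height}x{width}: {fps:.1f} calls/s")
```

When timing a real model on a GPU, CUDA kernels run asynchronously, so the device should be synchronized (for example with torch.cuda.synchronize()) before each clock reading; otherwise the measured time understates the actual work.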

Results

Results for CFPNet-V1, CFPNet-V2 and CFPNet-V3:

Dataset	Model	mIoU
Cityscapes	CFPNet-V1	60.4%
Cityscapes	CFPNet-V2	66.5%
Cityscapes	CFPNet-V3	70.1%
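The mIoU column is a mean of per-class intersection-over-union scores. As a reference for how that metric is defined, the sketch below computes it from a confusion matrix in plain Python. It is an illustration, not the repository's evaluation code: the function name is hypothetical, the toy counts are made up, and the handling of ignored labels or absent classes in the official Cityscapes evaluation may differ.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    confusion[i][j] counts pixels whose true class is i and predicted class is j.
    Classes that appear in neither the labels nor the predictions are skipped.
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp  # true class c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp  # predicted c wrongly
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy 3-class confusion matrix (made-up counts, for illustration only).
    toy = [
        [50, 5, 0],
        [10, 30, 5],
        [0, 0, 20],
    ]
    print(f"mIoU = {mean_iou(toy):.3f}")
```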

Sample results (from top to bottom: original image, CFPNet-V1, CFPNet-V2 and CFPNet-V3):

Figures: Category_acc vs size; Class_acc vs size; Class_acc vs parameter; Class_acc vs speed

Comparison

Results of Cityscapes

Results of CamVid

Citation

If you find our work helpful, please consider citing:

@article{lou2021cfpnet,
  title={CFPNet: Channel-wise Feature Pyramid for Real-Time Semantic Segmentation},
  author={Lou, Ange and Loew, Murray},
  journal={arXiv preprint arXiv:2103.12212},
  year={2021}
}
