Dilated Convolution for Semantic Image Segmentation

Last update: Dec 26, 2022

Multi-Scale Context Aggregation by Dilated Convolutions

Introduction

The properties of dilated convolution are discussed in our ICLR 2016 conference paper. This repository contains the network definitions and the trained models. You can use this code together with vanilla Caffe to segment images using the pre-trained models. If you want to train the models yourself, please see the training documentation.

If you are looking for dilation models with state-of-the-art performance and a Python implementation, please check out Dilated Residual Networks.

Citing

If you find the code or the models useful, please cite this paper:

@inproceedings{YuKoltun2016,
	author    = {Fisher Yu and Vladlen Koltun},
	title     = {Multi-Scale Context Aggregation by Dilated Convolutions},
	booktitle = {ICLR},
	year      = {2016},
}

License

The code and models are released under the MIT License (refer to the LICENSE file for details).

Installation

Caffe

Install Caffe and its Python interface. Make sure that the Caffe version is newer than commit 08c5df.

Python

The companion Python script demonstrates the network definitions and the trained weights.

The required Python packages are numba, numpy, and opencv. The Anaconda Python distribution is recommended.

If you use Anaconda, install the packages with:

conda install numba numpy opencv
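
Before running the demo, it can help to confirm that the packages above and Caffe's Python interface are importable. The following is a minimal sketch, not part of this repository; the helper name `missing_modules` is hypothetical, and the import names `cv2` (for opencv) and `caffe` are the usual ones for these packages.

```python
import importlib.util

# Import names for the packages listed above. The conda package "opencv"
# is imported as "cv2"; "caffe" is Caffe's Python interface.
REQUIRED_MODULES = ["numba", "numpy", "cv2", "caffe"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only looks the modules up and does not import them, so it does not verify that Caffe was built with GPU support.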

Running Demo

predict.py is the main script for testing the pre-trained models on images. The basic usage is:

python predict.py <dataset name> <image path>

Given the dataset name, the script finds the matching pre-trained model and network definition. We currently support models trained on four datasets: pascal_voc, camvid, kitti, and cityscapes. The steps for using the code are listed below:

Clone the code from GitHub

git clone git@github.com:fyu/dilation.git
cd dilation

Download pre-trained network
```
sh pretrained/download_pascal_voc.sh
```

Run the Pascal VOC model on GPU 0

python predict.py pascal_voc images/dog.jpg --gpu 0
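
To segment several images in one go, the command above can be wrapped in a short script. This is a sketch under stated assumptions rather than part of the repository: `build_command` and `run_demo` are hypothetical helpers, the script is assumed to run from the `dilation` directory, and only the arguments shown in this README (`<dataset name>`, `<image path>`, `--gpu`) are used.

```python
import subprocess
import sys

# Dataset names listed in this README.
DATASETS = ("pascal_voc", "camvid", "kitti", "cityscapes")


def build_command(dataset, image_path, gpu=None):
    """Build the predict.py command line for one image."""
    if dataset not in DATASETS:
        raise ValueError("unknown dataset: %s" % dataset)
    # Assumption: predict.py should run under the current interpreter.
    command = [sys.executable, "predict.py", dataset, image_path]
    if gpu is not None:
        command += ["--gpu", str(gpu)]
    return command


def run_demo(dataset, image_paths, gpu=None):
    """Run predict.py once per image; raises CalledProcessError on failure."""
    for path in image_paths:
        subprocess.check_call(build_command(dataset, path, gpu))
```

For example, `run_demo("pascal_voc", ["images/dog.jpg"], gpu=0)` issues the same command as the last step above. The pre-trained model for the chosen dataset must already be downloaded.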

Training

You are more than welcome to train our models on a new dataset. To do that, please see the training documentation.

Implementation of Dilated Convolution

Besides Caffe, dilated convolution is also implemented in other deep learning packages, for example:

Torch: SpatialDilatedConvolution
Lasagne: DilatedConv2DLayer
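
For readers who want to see what the operation computes, here is a minimal NumPy sketch of a single-channel dilated convolution. It is an illustration, not the Caffe, Torch, or Lasagne implementation: the function names are hypothetical, there is no padding or stride, and the operation is written as cross-correlation (no kernel flip), as is common in deep learning packages. The second helper computes the receptive field of a stack of stride-1 dilated layers.

```python
import numpy as np


def dilated_conv2d(image, kernel, dilation=1):
    """'Valid' 2-D dilated cross-correlation of one channel (no padding, stride 1)."""
    kh, kw = kernel.shape
    # Spatial extent covered by the kernel once its taps are spread apart.
    span_h = (kh - 1) * dilation + 1
    span_w = (kw - 1) * dilation + 1
    out_h = image.shape[0] - span_h + 1
    out_w = image.shape[1] - span_w + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("image is smaller than the dilated kernel")
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            # Tap (i, j) reads the input at offset (i * dilation, j * dilation).
            r, c = i * dilation, j * dilation
            out += kernel[i, j] * image[r:r + out_h, c:c + out_w]
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field width of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


image = np.arange(49, dtype=np.float64).reshape(7, 7)
kernel = np.ones((3, 3))
print(dilated_conv2d(image, kernel, dilation=2).shape)  # (3, 3)
print(receptive_field(3, [1, 2, 4]))  # 15
```

With `dilation=1` the function reduces to an ordinary convolution layer's sliding-window sum. Stacking 3x3 layers with dilations 1, 2, 4 gives a 15x15 receptive field while the number of parameters per layer stays the same.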

Owner

Fisher Yu
