This is an (re-)implementation of DeepLab-ResNet in TensorFlow for semantic image segmentation on the PASCAL VOC dataset.

Overview

DeepLab-ResNet-TensorFlow

This is an (re-)implementation of DeepLab-ResNet in TensorFlow for semantic image segmentation on the PASCAL VOC dataset.

Updates

29 Jan, 2017:

  • Fixed the implementation of the batch normalisation layer: it now supports both the training and inference steps. If the flag --is-training is provided, the running means and variances will be updated; otherwise, they will be kept intact. The .ckpt files have been updated accordingly - to download please refer to the new link provided below.
  • Image summaries during the training process can now be seen using TensorBoard.
  • Fixed the evaluation procedure: the 'void' label (255) is now correctly ignored. As a result, the performance score on the validation set has increased to 80.1%.

Model Description

The DeepLab-ResNet is built on a fully convolutional variant of ResNet-101 with atrous (dilated) convolutions, atrous spatial pyramid pooling, and multi-scale inputs (not implemented here).

The model is trained on a mini-batch of images and corresponding ground truth masks with the softmax classifier at the top. During training, the masks are downsampled to match the size of the output from the network; during inference, to acquire the output of the same size as the input, bilinear upsampling is applied. The final segmentation mask is computed using argmax over the logits. Optionally, a fully-connected probabilistic graphical model, namely, CRF, can be applied to refine the final predictions. On the test set of PASCAL VOC, the model achieves 79.7% of mean intersection-over-union.

For more details on the underlying model please refer to the following paper:

@article{CP2016Deeplab,
  title={DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs},
  author={Liang-Chieh Chen and George Papandreou and Iasonas Kokkinos and Kevin Murphy and Alan L Yuille},
  journal={arXiv:1606.00915},
  year={2016}
}

Requirements

TensorFlow needs to be installed before running the scripts. TensorFlow 0.12 is supported; for TensorFlow 0.11 please refer to this branch.

To install the required python packages (except TensorFlow), run

pip install -r requirements.txt

or for a local installation

pip install -user -r requirements.txt

Caffe to TensorFlow conversion

To imitate the structure of the model, we have used .caffemodel files provided by the authors. The conversion has been performed using Caffe to TensorFlow with an additional configuration for atrous convolution and batch normalisation (since the batch normalisation provided by Caffe-tensorflow only supports inference). There is no need to perform the conversion yourself as you can download the already converted models - deeplab_resnet.ckpt (pre-trained) and deeplab_resnet_init.ckpt (the last layers are randomly initialised) - here.

Nevertheless, it is easy to perform the conversion manually, given that the appropriate .caffemodel file has been downloaded, and Caffe to TensorFlow dependencies have been installed. The Caffe model definition is provided in misc/deploy.prototxt. To extract weights from .caffemodel, run the following:

python convert.py /path/to/deploy/prototxt --caffemodel /path/to/caffemodel --data-output-path /where/to/save/numpy/weights

As a result of running the command above, the model weights will be stored in /where/to/save/numpy/weights. To convert them to the native TensorFlow format (.ckpt), simply execute:

python npy2ckpt.py /where/to/save/numpy/weights --save-dir=/where/to/save/ckpt/weights

Dataset and Training

To train the network, one can use the augmented PASCAL VOC 2012 dataset with 10582 images for training and 1449 images for validation.

The training script allows to monitor the progress in the optimisation process using TensorBoard's image summary. Besides that, one can also exploit random scaling of the inputs during training as a means for data augmentation. For example, to train the model from scratch with random scale turned on, simply run:

python train.py --random-scale

To see the documentation on each of the training settings run the following:

python train.py --help

An additional script, fine_tune.py, demonstrates how to train only the last layers of the network.

Evaluation

The single-scale model shows 80.1% mIoU on the Pascal VOC 2012 validation dataset. No post-processing step with CRF is applied.

The following command provides the description of each of the evaluation settings:

python evaluate.py --help

Inference

To perform inference over your own images, use the following command:

python inference.py /path/to/your/image /path/to/ckpt/file

This will run the forward pass and save the resulted mask with this colour map:

Missing features

At the moment, the post-processing step with CRF is not implemented. Besides that, multi-scale inputs are missing, as well. No weight regularisation is applied.

Other implementations

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis

Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis. You write a high level configuration file specifying your in

Blue Collar Bioinformatics 917 Jan 03, 2023
GMFlow: Learning Optical Flow via Global Matching

GMFlow GMFlow: Learning Optical Flow via Global Matching Authors: Haofei Xu, Jing Zhang, Jianfei Cai, Hamid Rezatofighi, Dacheng Tao We streamline the

Haofei Xu 298 Jan 04, 2023
OpenMMLab Computer Vision Foundation

English | ็ฎ€ไฝ“ไธญๆ–‡ Introduction MMCV is a foundational library for computer vision research and supports many research projects as below: MMCV: OpenMMLab

OpenMMLab 4.6k Jan 09, 2023
A general python framework for visual object tracking and video object segmentation, based on PyTorch

PyTracking A general python framework for visual object tracking and video object segmentation, based on PyTorch. ๐Ÿ“ฃ Two tracking/VOS papers accepted

2.6k Jan 04, 2023
The code from the paper Character Transformations for Non-Autoregressive GEC Tagging

Character Transformations for Non-Autoregressive GEC Tagging Milan Straka, Jakub Nรกplava, Jana Strakovรก Charles University Faculty of Mathematics and

รšFAL 5 Dec 10, 2022
Multi-label Co-regularization for Semi-supervised Facial Action Unit Recognition (NeurIPS 2019)

MLCR This is the source code for paper Multi-label Co-regularization for Semi-supervised Facial Action Unit Recognition. Xuesong Niu, Hu Han, Shiguang

Edson-Niu 60 Nov 29, 2022
This repository contains the source codes for the paper AtlasNet V2 - Learning Elementary Structures.

AtlasNet V2 - Learning Elementary Structures This work was build upon Thibault Groueix's AtlasNet and 3D-CODED projects. (you might want to have a loo

Thรฉo Deprelle 123 Nov 11, 2022
This repo in the implementation of EMNLP'21 paper "SPARQLing Database Queries from Intermediate Question Decompositions" by Irina Saparina, Anton Osokin

SPARQLing Database Queries from Intermediate Question Decompositions This repo is the implementation of the following paper: SPARQLing Database Querie

Yandex Research 20 Dec 19, 2022
A simple, high level, easy-to-use open source Computer Vision library for Python.

ZoomVision : Slicing Aid Detection A simple, high level, easy-to-use open source Computer Vision library for Python. Installation Installing dependenc

Nurettin SinanoฤŸlu 2 Mar 04, 2022
Class activation maps for your PyTorch models (CAM, Grad-CAM, Grad-CAM++, Smooth Grad-CAM++, Score-CAM, SS-CAM, IS-CAM, XGrad-CAM, Layer-CAM)

TorchCAM: class activation explorer Simple way to leverage the class-specific activation of convolutional layers in PyTorch. Quick Tour Setting your C

F-G Fernandez 1.2k Dec 29, 2022
Application of the L2HMC algorithm to simulations in lattice QCD.

l2hmc-qcd ๐Ÿ“Š Slides Recent talk on Training Topological Samplers for Lattice Gauge Theory from the Machine Learning for High Energy Physics, on and of

Sam Foreman 37 Dec 14, 2022
3rd Place Solution for ICCV 2021 Workshop SSLAD Track 3A - Continual Learning Classification Challenge

Online Continual Learning via Multiple Deep Metric Learning and Uncertainty-guided Episodic Memory Replay 3rd Place Solution for ICCV 2021 Workshop SS

Rifki Kurniawan 6 Nov 10, 2022
ไธ€ไธช่ฟ่กŒๅœจ ๐ž๐ฅ๐ž๐œ๐•๐Ÿ๐ ๆˆ– ๐ช๐ข๐ง๐ ๐ฅ๐จ๐ง๐  ็ญ‰ๅฎšๆ—ถ้ขๆฟ็š„็ญพๅˆฐ้กน็›ฎ

ๅฎšๆ—ถ้ขๆฟไธŠ็š„็ญพๅˆฐ็›’ ไธ€ไธช่ฟ่กŒๅœจ ๐ž๐ฅ๐ž๐œ๐•๐Ÿ๐ ๆˆ– ๐ช๐ข๐ง๐ ๐ฅ๐จ๐ง๐  ็ญ‰ๅฎšๆ—ถ้ขๆฟ็š„็ญพๅˆฐ้กน็›ฎ ๐ž๐ฅ๐ž๐œ๐•๐Ÿ๐ ๐ช๐ข๐ง๐ ๐ฅ๐จ๐ง๐  ็‰นๅˆซๅฃฐๆ˜Ž ๆœฌไป“ๅบ“ๅ‘ๅธƒ็š„่„šๆœฌๅŠๅ…ถไธญๆถ‰ๅŠ็š„ไปปไฝ•่งฃ้”ๅ’Œ่งฃๅฏ†ๅˆ†ๆž่„šๆœฌ๏ผŒไป…็”จไบŽๆต‹่ฏ•ๅ’Œๅญฆไน ็ ”็ฉถ๏ผŒ็ฆๆญข็”จไบŽๅ•†ไธš็”จ้€”๏ผŒไธ่ƒฝไฟ่ฏๅ…ถๅˆ

Leon 1.1k Dec 30, 2022
Official implementation of the paper "Steganographer Detection via a Similarity Accumulation Graph Convolutional Network"

SAGCN - Official PyTorch Implementation | Paper | Project Page This is the official implementation of the paper "Steganographer detection via a simila

ZHANG Zhi 1 Nov 26, 2021
PyExplainer: A Local Rule-Based Model-Agnostic Technique (Explainable AI)

PyExplainer PyExplainer is a local rule-based model-agnostic technique for generating explanations (i.e., why a commit is predicted as defective) of J

AI Wizards for Software Management (AWSM) Research Group 14 Nov 13, 2022
scikit-learn: machine learning in Python

scikit-learn is a Python module for machine learning built on top of SciPy and is distributed under the 3-Clause BSD license. The project was started

scikit-learn 52.5k Jan 08, 2023
Replication Package for AequeVox:Automated Fariness Testing for Speech Recognition Systems

AequeVox Replication Package for AequeVox:Automated Fariness Testing for Speech Recognition Systems README under development. Python Packages Required

Sai Sathiesh 2 Aug 28, 2022
Multiple style transfer via variational autoencoder

ST-VAE Multiple style transfer via variational autoencoder By Zhi-Song Liu, Vicky Kalogeiton and Marie-Paule Cani This repo only provides simple testi

13 Oct 29, 2022
InterFaceGAN - Interpreting the Latent Space of GANs for Semantic Face Editing

InterFaceGAN - Interpreting the Latent Space of GANs for Semantic Face Editing Figure: High-quality facial attributes editing results with InterFaceGA

GenForce: May Generative Force Be with You 1.3k Jan 09, 2023
CMP 414/765 course repository for Spring 2022 semester

CMP414/765: Artificial Intelligence Spring2021 This is the GitHub repository for course CMP 414/765: Artificial Intelligence taught at The City Univer

ch00226855 4 May 16, 2022