DilatedNet in Keras for image segmentation

Last update: Mar 15, 2022

Overview

Keras implementation of DilatedNet for semantic segmentation

A native Keras implementation of semantic segmentation according to Multi-Scale Context Aggregation by Dilated Convolutions (2016). Optionally uses the authors' pretrained weights.
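To make the idea in the paper's title concrete, here is a minimal one-dimensional sketch of a dilated (atrous) convolution. It is not code from this repository: the function name dilated_conv1d and the toy inputs are made up for illustration, and the real model applies the same principle with 2D Keras layers.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D cross-correlation with `dilation - 1` skipped samples between kernel taps."""
    # Number of input samples covered by one output value (the receptive field).
    span = (len(kernel) - 1) * dilation + 1
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(k * signal[start + i * dilation] for i, k in enumerate(kernel)))
    return out


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]

dense = dilated_conv1d(signal, kernel, dilation=1)   # each output sees 3 samples
sparse = dilated_conv1d(signal, kernel, dilation=2)  # each output sees 5 samples
print(dense, sparse)
```

With the same three weights, the dilated version covers a wider stretch of the input, which is how dilation enlarges the context seen by each output without adding parameters.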

The code has been tested on TensorFlow 1.3, Keras 1.2, and Python 3.6.

Using the pretrained model

Download and extract the pretrained model:

curl -L https://github.com/nicolov/segmentation_keras/releases/download/model/nicolov_segmentation_model.tar.gz | tar xvf -

Install dependencies and run:

pip install -r requirements.txt
# For GPU support
pip install tensorflow-gpu==1.3.0

python predict.py --weights_path conversion/converted/dilation8_pascal_voc.npy

The output image will be under images/cat_seg.png.

Converting the original Caffe model

Follow the instructions in the conversion folder to convert the weights to the TensorFlow format that can be used by Keras.

Training

Download the Augmented Pascal VOC dataset here:

curl -L http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz | tar -xvf -

This will create a benchmark_RELEASE directory in the root of the repo. Use the convert_masks.py script to convert the provided masks in .mat format to RGB pngs:

python convert_masks.py \
    --in-dir benchmark_RELEASE/dataset/cls \
    --out-dir benchmark_RELEASE/dataset/pngs

Start training:

python train.py --batch-size 2

Model checkpoints are saved under trained/, and can be used with the predict.py script for testing.

The training code is currently limited to the frontend module, and thus only outputs 16x16 segmentation maps. The augmentation pipeline does mirroring but not cropping or rotation.
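The mirroring step mentioned above can be sketched as follows. This is an illustrative stand-in, not the repository's augmentation code: the names mirror_pair and random_mirror are hypothetical, and images are represented as plain nested lists so the example has no dependencies. The point it shows is that the image and its mask must be flipped together so that labels stay aligned with pixels.

```python
import random


def mirror_pair(image, mask):
    """Flip an image and its segmentation mask left-to-right, row by row."""
    return [row[::-1] for row in image], [row[::-1] for row in mask]


def random_mirror(image, mask, p=0.5, rng=random):
    """Mirror the pair with probability p; otherwise return it unchanged."""
    if rng.random() < p:
        return mirror_pair(image, mask)
    return image, mask


image = [[10, 20, 30],
         [40, 50, 60]]
mask = [[0, 0, 1],
        [0, 1, 1]]

flipped_image, flipped_mask = mirror_pair(image, mask)
print(flipped_image, flipped_mask)
```

With NumPy arrays of shape (height, width, ...), the same flip would typically be written as array[:, ::-1].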

Fisher Yu and Vladlen Koltun, Multi-Scale Context Aggregation by Dilated Convolutions, 2016

Comments

training and validation loss nan

First of all, I just want to thank you for the great work. I am having an issue during training: my loss and val_loss are both nan, but I am still getting values for accuracy and val_acc. I am training on the PASCAL_VOC 2012 dataset with the segmentation class pngs.

keras 1.2.1 & 2.0.6 tensorflow-gpu 1.2.1 python 3.6.1

opened by Barfknecht 9

Fine tuning ...

Hello,

You have provided the pre-trained model of VOC. I have a small dataset with 2 classes, which I annotated based on VOC and I want to fine-tune it. Would you please guide me through the process?

opened by MyVanitar 8

Modifying number of class

Hi Nicolov,

Thanks for the great work! I tried to train on a new dataset by generating my own set of jpg images and png masks. However, I realized it only works for the 20 predefined classes. For example, I wanted to retrain this network to segment screws from the background, but I couldn't find a way to add new classes other than reusing an existing color, 0x181818, which was originally used for cats. After training, it did segment the screws. However, I'm still wondering whether there is any way to change the number of classes and specify which color value is associated with each class.

opened by francisbitontistudio 7

Black image after segmentation

Hi! I have val accuracy = 1, but when I try to predict the mask for an image from the training set, I get a black image. Does anybody know the reason for this behaviour?

opened by dimaxano 7

docker running error

Hi, @nicolov ,

For the caffe weight conversion, I got the following error:

(tf_1.0) [email protected]:/data/code/segmentation_keras/conversion# docker run -v $(pwd):/workspace -ti `docker build -q .`
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
"docker run" requires at least 1 argument(s).
See 'docker run --help'.

Usage:  docker run [OPTIONS] IMAGE [COMMAND] [ARG...]

Run a command in a new container
(tf_1.0) ro[email protected]:/data/code/segmentation_keras/conversion#

It shows that the Docker daemon is not running. Is there another command I should run before it?

Thanks

opened by amiltonwong 7

the way of loading the weight

Hi nicolov,

In the post, you explained how to do the weight conversion. Due to constraints in my development environment, it is a little hard for me to follow your steps exactly.

On the Keras blog, the author also shows a way to load VGG16 weights directly from Keras. Do you think these weights can be used with your implementation? Do we have to use the converted Caffe model weights for pascal_voc? The dataset I will be using is from a different domain than the dataset used in the paper. Thanks for your advice.

opened by wenouyang 5

Problems with CuDNN library

While running train.py, I get this error message:

Epoch 1/20
E tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime CuDNN library: 6021 (compatibility version 6000) but source was compiled with 5105 (compatibility version 5100). If using a binary install, upgrade your CuDNN library to match. If building from sources, make sure the library loaded at runtime matches a compatible version specified during compile configuration.

Since I don't have the root account, I can't install CuDNN v5. Do you know how I can fix this? Thanks!
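For readers decoding the numbers in the log above, the following sketch splits a cuDNN version integer into its parts. It assumes the major * 1000 + minor * 100 + patch encoding used by cuDNN releases of that era; the helper names decode_cudnn_version and compatibility_version are hypothetical and are not part of TensorFlow or cuDNN.

```python
def decode_cudnn_version(code):
    """Split an integer such as 6021 into (major, minor, patch).

    Assumes the encoding major * 1000 + minor * 100 + patch.
    """
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def compatibility_version(code):
    """Drop the patch level, giving the 'compatibility version' shown in the log."""
    major, minor, _ = decode_cudnn_version(code)
    return major * 1000 + minor * 100


loaded = decode_cudnn_version(6021)    # library found at runtime
compiled = decode_cudnn_version(5105)  # library TensorFlow was built against
print(loaded, compiled)
```

Read this way, the message says that cuDNN 6.0.21 was loaded at runtime while the TensorFlow binary was compiled against cuDNN 5.1.5.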

opened by Yuren-Zhong 4

IoU results

Have you by any chance compared this to the original implementation with regards to the mean IoU? If so, what implementation of IoU did you use and what were your results?
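As background for this question, mean IoU is commonly computed per class as intersection over union and then averaged. The sketch below is a generic, dependency-free version of that definition; it is not the implementation used by this repository or by the original paper, and the function name mean_iou and the choice to skip classes absent from both truth and prediction are assumptions.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection-over-union over flattened per-pixel class labels.

    Classes that appear in neither the ground truth nor the prediction are skipped.
    """
    intersection = [0] * num_classes
    true_count = [0] * num_classes
    pred_count = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            intersection[t] += 1

    ious = []
    for c in range(num_classes):
        union = true_count[c] + pred_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
score = mean_iou(y_true, y_pred, num_classes=2)
print(score)
```

Implementations differ in how they treat absent classes and ignored labels, so scores are only comparable when those choices match.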

opened by Barfknecht 4

about the required pre-trained vgg model

Hi, @nicolov ,

According to this line, vgg_conv.npy is needed as pre-trained vgg model in training. Could you list the download location for corresponding caffemodel and prototxt file? And, is the conversion step the same as here?

Thanks!

opened by amiltonwong 4

regarding loading_weights

Hi nicolov,

In train.py, you have included the function load_weights(model, weights_path). My understanding is that you are trying to load a pretrained VGG model. If I do not want to use this pretrained model because the problem I am working on may belong to a totally different domain, should I just skip calling this load_weights function? Or is using a pretrained model always preferable? I am somewhat confused about this.

In the notes, you mentioned that the training code is currently limited to the frontend module, and thus only outputs 16x16 segmentation maps. If I would like to use this code for my own dataset, what modifications do I have to make? Do I still have to load the weights?

Thank you very much!

opened by wenouyang 4

Cannot locate Dockerfile: Dockerfile

Probably a rookie error but when I am trying to run the conversion step in conversion by running the docker I get the following error:

$sudo docker run -v $(pwd):/workspace -ti `docker build -q .`
time="2017-02-09T09:15:11-08:00" level=fatal msg="Cannot locate Dockerfile: Dockerfile" 
docker: "run" requires a minimum of 1 argument. See 'docker run --help'.

opened by mongoose54 4

Training freezes

When I run the command python train.py --batch-size 2, training freezes at the last step of the first epoch.

All the libraries match the requirements.txt file.

opened by ghost 1

AtrousConvolution2D vs. Conv2DTranspose

Hi @nicolov I was wondering whether in your model, you wouldn't need to have a Conv2DTranspose or Upsample layer to compensate for the maxpool and obtain predictions with the same size as your input image?

opened by tinalegre 0

How to handle high resolution images

Hello @nicolov,

Let me first express my appreciation for your work on image segmentation; it's great.

A small suggestion: there is a missing -- in the input argument parsing. It is a very minor change:

parser.add_argument('--input_path', nargs='?', default='images/cat.jpg', help='Required path to input image')

I'm hoping you can help me understand how to handle high-resolution images such as 1028 and 4K.

Also, in the code I found that you set input_width, input_height = 900, 900 and label_margin = 186. Can you please explain the reason for these fixed numbers and how they affect the output height and width?

output_height = input_height - 2 * label_margin
output_width = input_width - 2 * label_margin

opened by engahmed1190 2
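As a worked instance of the two formulas quoted in this post, the sketch below plugs in the values the poster reports (a 900 x 900 input and label_margin = 186). The helper name output_size is made up for illustration, and the sketch only reproduces the arithmetic; it does not explain why the repository chose those constants.

```python
def output_size(input_size, label_margin):
    """Apply output = input - 2 * label_margin along one dimension."""
    return input_size - 2 * label_margin


input_width, input_height = 900, 900
label_margin = 186

output_width = output_size(input_width, label_margin)
output_height = output_size(input_height, label_margin)
print(output_width, output_height)  # 528 528
```

The margin is removed once from each side, so the labelled region is 2 * 186 = 372 pixels smaller than the input in each dimension.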
Context module training implementation plans

Thanks for creating this implementation. Do you have any plans to implement training of the context module (to allow producing full resolution segmentation maps)?

opened by OliverColeman 3

palette conversion not needed

https://github.com/nicolov/segmentation_keras/blob/master/convert_masks.py isn't necessary.

Just use Pillow and you can load the classes separately from the color palette, which means it will already be in the format you want!

from https://github.com/aurora95/Keras-FCN/blob/master/utils/SegDataGenerator.py#L203

from PIL import Image
label = Image.open(label_filepath)
if self.save_to_dir and self.palette is None:
    self.palette = label.palette

Cool, right?

opened by ahundt 6

Releases

caffemodel (Jan 5, 2019)

Downloaded from http://dl.yf.io/dilation/models/dilation8_pascal_voc.caffemodel
dilation8_pascal_voc.caffemodel (538.44 MB)

model (Jun 26, 2017)

TensorFlow converted model.
nicolov_segmentation_model.tar.gz (538.45 MB)
