Image Segmentation and Object Detection in Pytorch

Last update: Dec 10, 2022

Pytorch-Segmentation-Detection is a library for image segmentation and object detection. It provides reported results on common image segmentation and object detection datasets, pretrained models, and scripts to reproduce those results.

Segmentation

PASCAL VOC 2012

The implemented models were trained on the PASCAL VOC 2012 training data together with the additional Berkeley segmentation data for PASCAL VOC 2012. They were tested on either the Restricted PASCAL VOC 2012 Validation dataset (RV-VOC12) or the Full PASCAL VOC 2012 Validation dataset (listed as VOC12 in the table below).

You can find all the scripts that were used for training and evaluation here.

This code has been used to train networks with this performance:

Model	Test data	Mean IOU	Mean pix. accuracy	Pixel accuracy	Inference time (512x512 px. image)	Model Download Link	Related paper
Resnet-18-8s	RV-VOC12	59.0	in prog.	in prog.	28 ms.	Dropbox	DeepLab
Resnet-34-8s	RV-VOC12	68.0	in prog.	in prog.	50 ms.	Dropbox	DeepLab
Resnet-50-16s	VOC12	66.5	in prog.	in prog.	in prog.	in prog.	DeepLab
Resnet-50-8s	VOC12	67.0	in prog.	in prog.	in prog.	in prog.	DeepLab
Resnet-50-8s-deep-sup	VOC12	67.1	in prog.	in prog.	in prog.	in prog.	DeepLab
Resnet-101-16s	VOC12	68.6	in prog.	in prog.	in prog.	in prog.	DeepLab
PSP-Resnet-18-8s	VOC12	68.3	n/a	n/a	n/a	in prog.	PSPnet
PSP-Resnet-50-8s	VOC12	73.6	n/a	n/a	n/a	in prog.	PSPnet
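The three accuracy columns in the table above are standard segmentation metrics that can all be derived from a per-class confusion matrix. The following is a minimal sketch of those definitions, not the library's own evaluation code. The function names and the `ignore_label=255` default (the value PASCAL VOC uses for void pixels) are assumptions made for illustration, and the labels are passed as flat lists of integers.

```python
def confusion_matrix(y_true, y_pred, num_classes, ignore_label=255):
    """Count label pairs; rows are ground truth, columns are predictions.

    ignore_label is an assumption for this sketch: pixels whose ground
    truth equals it are skipped.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_label:
            continue
        matrix[t][p] += 1
    return matrix


def segmentation_metrics(matrix):
    """Return pixel accuracy, mean pixel accuracy and mean IoU as fractions."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(matrix[i][i] for i in range(n))

    ious, accuracies = [], []
    for i in range(n):
        true_positive = matrix[i][i]
        ground_truth = sum(matrix[i])                     # pixels labelled i
        predicted = sum(matrix[r][i] for r in range(n))   # pixels predicted as i
        union = ground_truth + predicted - true_positive
        if union > 0:                                     # skip absent classes
            ious.append(true_positive / union)
        if ground_truth > 0:
            accuracies.append(true_positive / ground_truth)

    return {
        "pixel_accuracy": correct / total,
        "mean_pixel_accuracy": sum(accuracies) / len(accuracies),
        "mean_iou": sum(ious) / len(ious),
    }


# Toy example with two classes and four pixels.
metrics = segmentation_metrics(
    confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
)
```

The values are returned as fractions in [0, 1]; the tables in this document report them as percentages. Classes that appear in neither the ground truth nor the predictions are left out of the mean, which is one common convention but not the only one.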

Some qualitative results:

Endovis 2017

The implemented models were trained on the Endovis 2017 segmentation dataset. Sequence number 3 was held out for validation and was not included in the training set.

The code for acquiring the training data and for training and validating the model is also provided in the library.
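The hold-out scheme described above, where one whole sequence is reserved for validation, can be sketched as follows. This is an illustrative example rather than the library's data loader: the function name, the `(sequence_number, frame_path)` pair layout, and the file paths are all hypothetical.

```python
def split_by_sequence(frames, validation_sequences=(3,)):
    """Split (sequence_number, frame_path) pairs into train and validation lists.

    Every frame of a held-out sequence goes to validation, so no frame
    from that sequence can leak into training.
    """
    held_out = set(validation_sequences)
    train, validation = [], []
    for sequence_number, frame_path in frames:
        target = validation if sequence_number in held_out else train
        target.append((sequence_number, frame_path))
    return train, validation


# Hypothetical frame listing; the paths are placeholders.
frames = [
    (1, "seq_1/frame000.png"),
    (2, "seq_2/frame000.png"),
    (3, "seq_3/frame000.png"),
    (3, "seq_3/frame001.png"),
]
train_frames, validation_frames = split_by_sequence(frames)
```

Splitting by sequence rather than by individual frame matters for video data, because neighbouring frames are nearly identical and a random per-frame split would overstate validation accuracy.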

Additional qualitative results can be found in this YouTube playlist.

Binary Segmentation

Model	Test data	Mean IOU	Mean pix. accuracy	Pixel accuracy	Inference time (512x512 px. image)	Model Download Link
Resnet-9-8s	Seq # 3 *	96.1	in prog.	in prog.	13.3 ms.	Dropbox
Resnet-18-8s	Seq # 3	96.0	in prog.	in prog.	28 ms.	Dropbox
Resnet-34-8s	Seq # 3	in prog.	in prog.	in prog.	50 ms.	in prog.

* The Resnet-9-8s network was tested at half resolution (512 x 640).

Qualitative results (on validation sequence):

Multi-class Segmentation

Model	Test data	Mean IOU	Mean pix. accuracy	Pixel accuracy	Inference time (512x512 px. image)	Model Download Link
Resnet-18-8s	Seq # 3	81.0	in prog.	in prog.	28 ms.	Dropbox
Resnet-34-8s	Seq # 3	in prog.	in prog.	in prog.	50 ms.	in prog.

Qualitative results (on validation sequence):

Cityscapes

The Cityscapes dataset contains video sequences recorded in street scenes from 50 different cities, with high-quality pixel-level annotations of 5,000 frames. The annotations cover 19 classes, such as cars, road, and traffic signs.

Model	Test data	Mean IOU	Mean pix. accuracy	Pixel accuracy	Inference time (512x512 px. image)	Model Download Link
Resnet-18-32s	Validation set	61.0	in prog.	in prog.	in prog.	in prog.
Resnet-18-8s	Validation set	60.0	in prog.	in prog.	28 ms.	Dropbox
Resnet-34-8s	Validation set	69.1	in prog.	in prog.	50 ms.	Dropbox
Resnet-50-16s-PSP	Validation set	71.2	in prog.	in prog.	in prog.	in prog.
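The tables report inference time per 512x512 image, but the source does not say how those timings were taken. One common way to obtain such a number is to average over repeated runs after a few warm-up calls, as in the sketch below. The function name and its `warmup` and `repeats` parameters are assumptions for illustration, and `run_inference` stands for any callable that performs one forward pass.

```python
import time


def mean_inference_time_ms(run_inference, example, warmup=5, repeats=20):
    """Average wall-clock time of run_inference(example) in milliseconds.

    Warm-up calls are excluded so one-time costs (memory allocation,
    kernel selection, caching) do not skew the average.
    """
    for _ in range(warmup):
        run_inference(example)

    start = time.perf_counter()
    for _ in range(repeats):
        run_inference(example)
    elapsed = time.perf_counter() - start

    return 1000.0 * elapsed / repeats


# Stand-in workload; a real measurement would pass a model forward pass.
elapsed_ms = mean_inference_time_ms(lambda pixels: sum(pixels), list(range(512 * 512)))
```

When timing a model on a GPU, the callable should block until the result is ready, for example by calling `torch.cuda.synchronize()` before returning, because CUDA kernels are launched asynchronously.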

Qualitative results (on validation sequence):

The whole sequence can be viewed here.

Installation

This code requires:

PyTorch.
Some additional libraries, which can be obtained by installing the Anaconda distribution. Alternatively, you can install scikit-image, matplotlib, and numpy using pip.

Clone the library:

git clone --recursive https://github.com/warmspringwinds/pytorch-segmentation-detection

Then run this code snippet before you start using the library:

import sys
# update with your path
# All the jupyter notebooks in the repository already have this
sys.path.append("/your/path/pytorch-segmentation-detection/")
sys.path.insert(0, '/your/path/pytorch-segmentation-detection/vision/')

Here we use our pytorch/vision fork, which might be merged upstream in the future. We have added it as a submodule to our repository.
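The path setup above can also be wrapped in a small helper that fails early when the `vision` submodule is missing, which typically happens when the repository was cloned without `--recursive`. This is a sketch under stated assumptions, not part of the library: the function name `add_repo_to_path` is hypothetical, and it assumes the submodule lives in a `vision` directory at the repository root, as the snippet above suggests.

```python
import sys
from pathlib import Path


def add_repo_to_path(repo_root):
    """Make the library and its bundled pytorch/vision fork importable.

    The repository root is appended to sys.path, while the vision fork is
    inserted at the front so it takes precedence over any installed
    torchvision package.
    """
    root = Path(repo_root).expanduser().resolve()
    vision = root / "vision"

    # An absent or empty directory suggests the submodule was never fetched.
    if not vision.is_dir() or not any(vision.iterdir()):
        raise FileNotFoundError(
            f"{vision} is missing or empty; try "
            "'git submodule update --init --recursive' in the repository"
        )

    root_str, vision_str = str(root), str(vision)
    if root_str not in sys.path:
        sys.path.append(root_str)
    if vision_str in sys.path:
        sys.path.remove(vision_str)
    sys.path.insert(0, vision_str)
    return root
```

A call such as `add_repo_to_path("/your/path/pytorch-segmentation-detection/")` would then replace the two `sys.path` lines. Inserting the fork at index 0 mirrors the original snippet: Python searches `sys.path` in order, so the fork is found before an installed torchvision.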

Manually download the segmentation or detection models that you want to use (download links are listed in the model tables).

About

If you use this code in your research, please cite the paper:

@article{pakhomov2017deep,
  title={Deep Residual Learning for Instrument Segmentation in Robotic Surgery},
  author={Pakhomov, Daniil and Premachandran, Vittal and Allan, Max and Azizian, Mahdi and Navab, Nassir},
  journal={arXiv preprint arXiv:1703.08580},
  year={2017}
}

During implementation, some preliminary experiments and notes were reported:

Owner

Daniil Pakhomov
