Pixel-wise segmentation on the VOC2012 dataset using PyTorch.

Last update: Dec 30, 2022

Overview

PiWiSe

Pixel-wise segmentation on the VOC2012 dataset using PyTorch.

For a more complete implementation of segmentation networks, check out semseg.

Note:

FCN differs from the original implementation (see this issue).
SegNet does not match the performance reported in the original paper (see here).
PSPNet lacks "atrous convolution" (the conv layers of ResNet101 should be amended to preserve image size).

Keeping this in mind, feel free to open a PR. Thank you!
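
The PSPNet note above comes down to output-size arithmetic. As a rough sketch (not code from this repository), the standard convolution output-size formula shows why a strided layer shrinks the feature map while a dilated (atrous) layer with matching padding keeps it. The function name conv_output_size is hypothetical and chosen for illustration only.

```python
def conv_output_size(size, kernel=3, stride=1, padding=0, dilation=1):
    """Spatial output size of a convolution along one dimension.

    Standard formula: floor((size + 2p - d*(k-1) - 1) / s) + 1
    """
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


# A strided 3x3 convolution halves a 64-pixel side.
strided = conv_output_size(64, kernel=3, stride=2, padding=1)

# Replacing the stride with dilation 2 (and padding 2) preserves the side.
atrous = conv_output_size(64, kernel=3, stride=1, padding=2, dilation=2)

print(strided, atrous)
```

Swapping stride for dilation in the later ResNet stages is the usual way to keep resolution, but which layers to change in this code base is left open by the note.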

Setup

See dataset examples here.

Download

Download the image archive, extract it, and run:

mkdir data
mv VOCdevkit/VOC2012/JPEGImages data/images
mv VOCdevkit/VOC2012/SegmentationClass data/classes
rm -rf VOCdevkit
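
In VOC2012 only a subset of the JPEG images has a segmentation mask, so after the move data/images holds more files than data/classes. The following is a minimal sketch for listing the usable image/mask pairs, assuming masks are .png files that share a base name with their .jpg image. The helper paired_samples is hypothetical and not part of this repository.

```python
from pathlib import Path


def paired_samples(root="data"):
    """Return (image_path, mask_path) pairs that exist in both folders.

    Assumes the layout created above: <root>/images/*.jpg and
    <root>/classes/*.png, matched by file stem.
    """
    images = Path(root) / "images"
    classes = Path(root) / "classes"
    if not classes.is_dir():
        return []
    pairs = []
    for mask in sorted(classes.glob("*.png")):
        image = images / (mask.stem + ".jpg")
        if image.exists():
            pairs.append((image, mask))
    return pairs
```

Calling len(paired_samples("data")) after the setup steps gives a quick sanity check that the folders were moved correctly.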

Install

We recommend using pyenv:

pyenv virtualenv 3.6.0 piwise
pyenv activate piwise

Then install the requirements with pip install -r requirements.txt.

Usage

For the latest documentation, run:

python main.py --help

Supported values for the --model parameter are fcn8, fcn16, fcn32, unet, segnet1, segnet2, pspnet.

Training

If you want visualization, open an extra terminal tab and run:

python -m visdom.server -port 5000

Train the SegNet model for 30 epochs with CUDA support, visualization, and checkpoints every 100 steps:

python main.py --cuda --model segnet2 train --datadir data \
    --num-epochs 30 --num-workers 4 --batch-size 4 \
    --steps-plot 50 --steps-save 100

Evaluation

To run semantic segmentation on foo.jpg with a trained checkpoint:

python main.py --model segnet2 --state segnet2-30-0 eval foo.jpg foo.png

The segmented class image can now be found at foo.png.
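
Both the training and evaluation commands put global options (--cuda, --model, --state) before the subcommand and mode-specific options after it. The sketch below reconstructs that layout with argparse purely from the commands shown in this README. It is not the repository's actual parser: the positional names image and label are assumptions, and defaults are left unset because they are not stated here. python main.py --help remains the authoritative reference.

```python
import argparse

MODELS = ["fcn8", "fcn16", "fcn32", "unet", "segnet1", "segnet2", "pspnet"]


def build_parser():
    """Illustrative parser mirroring the flags used in the README commands."""
    parser = argparse.ArgumentParser(prog="main.py")
    # Global options come before the subcommand.
    parser.add_argument("--cuda", action="store_true")
    parser.add_argument("--model", choices=MODELS, required=True)
    parser.add_argument("--state")

    modes = parser.add_subparsers(dest="mode", required=True)

    train = modes.add_parser("train")
    train.add_argument("--datadir", required=True)
    for flag in ("--num-epochs", "--num-workers", "--batch-size",
                 "--steps-plot", "--steps-save"):
        train.add_argument(flag, type=int)

    evaluate = modes.add_parser("eval")
    evaluate.add_argument("image")  # assumed name for the input file
    evaluate.add_argument("label")  # assumed name for the output file
    return parser


args = build_parser().parse_args(
    ["--model", "segnet2", "--state", "segnet2-30-0", "eval", "foo.jpg", "foo.png"]
)
print(args.mode, args.image, args.label)
```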

Results

These are some results based on SegNet after 40 epochs. Set

loss_weights[0] = 1 / 1

to deal gracefully with the class imbalance problem.
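
To illustrate what a per-class loss weight does, here is a dependency-free sketch of a weighted mean cross-entropy over a flat list of pixels. It follows the common convention of dividing the weighted sum by the sum of the weights of the target classes. The function weighted_cross_entropy and the example weights are assumptions for illustration, not this repository's training code or its recommended values.

```python
import math


def weighted_cross_entropy(logits, targets, class_weights):
    """Weighted mean cross-entropy.

    logits: list of per-pixel score lists (one score per class)
    targets: list of per-pixel class indices
    class_weights: one non-negative weight per class
    """
    total, weight_sum = 0.0, 0.0
    for scores, target in zip(logits, targets):
        # Numerically stable log-softmax evaluated at the target class.
        peak = max(scores)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        log_prob = scores[target] - log_norm
        weight = class_weights[target]
        total += -weight * log_prob
        weight_sum += weight
    return total / weight_sum if weight_sum > 0 else 0.0


# Two easy background pixels (class 0) and one uncertain object pixel (class 1).
logits = [[2.0, 0.0], [2.0, 0.0], [0.0, 0.0]]
targets = [0, 0, 1]

uniform = weighted_cross_entropy(logits, targets, [1.0, 1.0])
reweighted = weighted_cross_entropy(logits, targets, [0.1, 1.0])
print(uniform, reweighted)
```

With the background weight lowered, the uncertain object pixel accounts for a larger share of the loss, which is the effect one wants when background pixels dominate the images.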

(Figure: example results shown in three columns: Input, Output, Ground Truth.)

Owner

Bodo Kaiser
