FCN-semantic-segmentation

Simple end-to-end semantic segmentation using fully convolutional networks [1]. Takes a pretrained 34-layer ResNet [2], removes the fully connected layers, and adds transposed convolution layers with skip connections from lower layers. Initialises upsampling convolutions with bilinear interpolation filters and zeros the final (classification) layer.
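As a rough illustration of the bilinear initialisation mentioned above, the following sketch builds a 2D bilinear interpolation filter in plain Python. The function name `bilinear_kernel` and the 4x4 size (a common pairing with a stride-2 transposed convolution) are assumptions made for illustration; the README does not state the kernel sizes or how the filter is copied across channels.

```python
def bilinear_kernel(size):
    """Return a size x size bilinear interpolation filter as nested lists."""
    factor = (size + 1) // 2
    # Centre of the kernel in index coordinates (half-pixel for even sizes).
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    # 1D triangular profile; the 2D filter is its outer product with itself.
    profile = [1 - abs(i - center) / factor for i in range(size)]
    return [[a * b for b in profile] for a in profile]


# Hypothetical example: a 4x4 filter, as might be used for 2x upsampling.
kernel = bilinear_kernel(4)
print(kernel[0])  # [0.0625, 0.1875, 0.1875, 0.0625]
```

A transposed convolution whose weights are set to this filter performs bilinear upsampling before any training, which is presumably why it is paired with a zeroed classification layer: the network starts from a neutral output and learns corrections from there.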

Uses an independent cross-entropy loss per class. Trains with SGD with momentum, applying weight decay only to the convolutional weights. Calculates and plots class-wise and mean intersection-over-union (IoU). Checkpoints the network every epoch.
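The loss and the metric described above can be sketched in plain Python. This is a minimal, framework-agnostic sketch rather than the repository's code: the function names are hypothetical, the loss is shown for a single pixel, and pixels with an "ignore" label (which Cityscapes contains) are assumed to have been filtered out beforehand.

```python
import math


def independent_cross_entropy(logits, one_hot):
    """Mean binary cross-entropy over classes for one pixel.

    Each class gets its own sigmoid, so classes are scored independently
    instead of competing through a softmax.
    """
    total = 0.0
    for z, t in zip(logits, one_hot):
        # Numerically stable form of
        # -[t * log(sigmoid(z)) + (1 - t) * log(1 - sigmoid(z))]
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def class_iou(preds, labels, num_classes):
    """Per-class IoU and their mean from flat lists of class ids."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(preds, labels):
        if p == t:
            intersection[p] += 1
            union[p] += 1
        else:
            # A mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u if u else float("nan") for i, u in zip(intersection, union)]
    valid = [x for x in ious if not math.isnan(x)]
    mean = sum(valid) / len(valid) if valid else float("nan")
    return ious, mean


ious, mean_iou = class_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(ious, mean_iou)
```

Classes that appear in neither the predictions nor the labels get a NaN IoU here and are left out of the mean; whether the repository treats absent classes this way is not stated in the README.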

Note: This code does not achieve great results: it reaches ~40 IoU fairly quickly but then plateaus. Contributions to fix this are welcome! The goal of this repo is to provide strong, simple and efficient baselines for semantic segmentation using the FCN method, so the code need not stay tied to ResNet-34 or to the other current design choices.

Requirements

Instructions

1. Install all of the required software. To feasibly run the training, CUDA is needed. The crop size and batch size can be tailored to your GPU memory (the default crop and batch sizes use ~10GB of GPU RAM).
2. Register on the Cityscapes website to access the dataset.
3. Download and extract the training/validation RGB data (leftImg8bit_trainvaltest) and ground truth data (gtFine_trainvaltest).
4. Run python main.py <options>.

First a Dataset object is set up, returning the RGB inputs, one-hot targets (for independent classification) and label targets. During training, the images are randomly cropped and horizontally flipped. Testing calculates IoU scores and produces a subset of coloured predictions that match the coloured ground truth.
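The key point in the augmentation step is that the image and its label map must receive the same crop and the same flip, otherwise pixels and targets fall out of alignment. The sketch below shows this with nested lists and the standard library only; `random_crop_flip` and `one_hot` are hypothetical names, and the 0.5 flip probability is an assumption, since the README does not give it.

```python
import random


def random_crop_flip(image, labels, crop_h, crop_w, rng=random):
    """Apply one shared random crop and horizontal flip to image and labels.

    image:  H x W nested list of pixels (e.g. RGB tuples)
    labels: H x W nested list of integer class ids
    """
    h, w = len(labels), len(labels[0])
    top = rng.randint(0, h - crop_h)   # randint includes both endpoints
    left = rng.randint(0, w - crop_w)
    flip = rng.random() < 0.5          # assumed flip probability

    def cut(grid):
        rows = [row[left:left + crop_w] for row in grid[top:top + crop_h]]
        return [row[::-1] for row in rows] if flip else rows

    return cut(image), cut(labels)


def one_hot(labels, num_classes):
    """Turn an H x W label map into num_classes binary H x W planes."""
    return [[[1 if v == c else 0 for v in row] for row in labels]
            for c in range(num_classes)]


# Toy data: each pixel stores its own coordinates so alignment can be checked.
rng = random.Random(0)
image = [[(r, c, 0) for c in range(6)] for r in range(4)]
labels = [[(r + c) % 3 for c in range(6)] for r in range(4)]
img_crop, lab_crop = random_crop_flip(image, labels, 2, 3, rng)
targets = one_hot(lab_crop, 3)  # one binary plane per class
```

The one-hot planes serve as targets for the independent per-class loss, while the integer label map remains convenient for computing IoU.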

References

[1] Fully convolutional networks for semantic segmentation
[2] Deep Residual Learning for Image Recognition

Owner

Kai Arulkumaran