Instance Segmentation by Jointly Optimizing Spatial Embeddings and Clustering Bandwidth

Last update: Dec 07, 2022

This codebase implements the loss function described in:

Instance Segmentation by Jointly Optimizing Spatial Embeddings and Clustering Bandwidth
Davy Neven, Bert De Brabandere, Marc Proesmans, and Luc Van Gool
Conference on Computer Vision and Pattern Recognition (CVPR), June 2019

Our network architecture is a multi-branched version of ERFNet and uses the Lovasz-hinge loss for maximizing the IoU of each instance.
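To make the idea of spatial embeddings with a clustering bandwidth more concrete, here is a minimal sketch in plain Python. It follows one reading of the cited paper and is not the code in this repository: each pixel is shifted by a predicted offset, and a Gaussian with bandwidth sigma around the instance center scores how likely the pixel belongs to that instance. The function names, the toy coordinates, and the 0.5 threshold are assumptions chosen for illustration.

```python
import math


def embed(pixel_xy, offset_xy):
    """Spatial embedding: pixel coordinate plus its predicted offset."""
    return tuple(p + o for p, o in zip(pixel_xy, offset_xy))


def gaussian_score(embedding, center, sigma):
    """Score in (0, 1]: 1 at the instance center, decaying with distance.

    sigma plays the role of the clustering bandwidth; a larger sigma
    accepts embeddings that land further from the center.
    """
    sq_dist = sum((e - c) ** 2 for e, c in zip(embedding, center))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Toy values (made up): normalized image coordinates.
center = (0.50, 0.50)
sigma = 0.10

# A pixel whose offset points at the center, and one far away with no offset.
inside = embed((0.45, 0.52), (0.05, -0.02))
outside = embed((0.10, 0.90), (0.00, 0.00))

score_inside = gaussian_score(inside, center, sigma)
score_outside = gaussian_score(outside, center, sigma)

# Assumed decision rule: a pixel joins the instance when its score exceeds 0.5.
is_foreground = score_inside > 0.5
print(score_inside, score_outside, is_foreground)
```

Thresholding the per-pixel scores gives a soft-to-hard instance mask, which is the kind of output an IoU-based loss such as the Lovasz-hinge can then be applied to.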

License

This software is released under a Creative Commons license which allows for personal and research use only. For a commercial license, please contact the authors. You can view a license summary here.

Getting started

This codebase showcases the proposed loss function on car instance segmentation using the Cityscapes dataset.

Prerequisites

Dependencies:

PyTorch 1.1
Python 3.6.8 (or higher)
Cityscapes + scripts (if you want to evaluate the model)

Training

Training consists of two steps. We first train on 512x512 crops around each object, to avoid computation on background patches. Afterwards, we finetune on larger patches (1024x1024) to account for bigger objects and background features that are not present in the smaller crops.

To generate these crops do the following:

$ CITYSCAPES_DIR=/path/to/cityscapes/ python utils/generate_crops.py
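The idea of cropping around each object can be sketched as follows. This is an illustrative helper, not the contents of utils/generate_crops.py: it assumes the crop is centered on the object's center and shifted where necessary so that it stays inside the image. The function name `crop_origin` and its parameters are hypothetical.

```python
def crop_origin(center_y, center_x, image_h, image_w, crop_size=512):
    """Return the (top, left) corner of a crop_size x crop_size window.

    The window is centered on (center_y, center_x) when possible and
    clamped so that it never extends past the image border.
    """
    top = int(min(max(center_y - crop_size / 2, 0), image_h - crop_size))
    left = int(min(max(center_x - crop_size / 2, 0), image_w - crop_size))
    return top, left


# Cityscapes images are 1024 pixels high and 2048 pixels wide.
print(crop_origin(512, 1024, 1024, 2048))   # object in the middle of the image
print(crop_origin(10, 10, 1024, 2048))      # object near the top-left corner
print(crop_origin(1000, 2000, 1024, 2048))  # object near the bottom-right corner
```

Clamping rather than padding keeps every crop the same size, so the crops can be batched without extra handling.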

Afterwards start training:

$ CITYSCAPES_DIR=/path/to/cityscapes/ python train.py

Options can be changed in train_config.py; for example, set display=True to enable visualization during training.
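The commands above pass the dataset location through the CITYSCAPES_DIR environment variable. A script can pick up such a variable roughly as shown below. This is a generic sketch that assumes the variable is read through `os.environ`; the helper name `dataset_root` is hypothetical and does not come from this codebase.

```python
import os


def dataset_root(env=os.environ):
    """Return the Cityscapes root directory taken from CITYSCAPES_DIR.

    Raises a RuntimeError with a hint when the variable is missing or empty,
    so a forgotten prefix on the command line fails early and clearly.
    """
    root = env.get("CITYSCAPES_DIR")
    if not root:
        raise RuntimeError(
            "CITYSCAPES_DIR is not set; run e.g. "
            "CITYSCAPES_DIR=/path/to/cityscapes/ python train.py"
        )
    return root


# Example with an explicit mapping instead of the real environment.
print(dataset_root({"CITYSCAPES_DIR": "/path/to/cityscapes/"}))
```

Prefixing the command with the variable, as in the commands above, sets it for that single run only and leaves the rest of the shell session unchanged.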

Testing

You can download a pretrained model here. Save the file in src/pretrained_models/, or adapt test_config.py to point to its location.

To test the model on the Cityscapes validation set run:

$ CITYSCAPES_DIR=/path/to/cityscapes/ python test.py

The pretrained model achieves 56.4 AP for cars on the Cityscapes validation set.

Acknowledgement

This work was supported by Toyota and was carried out at the TRACE Lab at KU Leuven (Toyota Research on Automated Cars in Europe - Leuven).
