Weakly Supervised Segmentation with Tensorflow. Implements instance segmentation as described in Simple Does It: Weakly Supervised Instance and Semantic Segmentation, by Khoreva et al. (CVPR 2017).

Overview

Weakly Supervised Segmentation with TensorFlow

This repo contains a TensorFlow implementation of weakly supervised instance segmentation as described in Simple Does It: Weakly Supervised Instance and Semantic Segmentation, by Khoreva et al. (CVPR 2017).

The idea behind weakly supervised segmentation is to train a model using cheap-to-generate label approximations (e.g., bounding boxes) as substitute/guiding labels for computer vision classification tasks that usually require very detailed labels. In semantic labelling, each image pixel is assigned to a specific class (e.g., boat, car, background, etc.). In instance segmentation, all the pixels belonging to the same object instance are given the same instance ID.

Per [2014a], pixelwise mask annotations are far more expensive to generate than object bounding box annotations (requiring up to 15x more time). Some models, like Simply Does It (SDI) [2016a] claim they can use a weak supervision approach to reach 95% of the quality of the fully supervised model, both for semantic labelling and instance segmentation.

Simple Does It (SDI)

Experimental Setup for Instance Segmentation

In weakly supervised instance segmentation, there are no pixel-wise annotations (i.e., no segmentation masks) that can be used to train a model. Yet, we aim to train a model that can still predict segmentation masks by only being given an input image and bounding boxes for the objects of interest in that image.

The masks used for training are generated starting from individual object bounding boxes. For each annotated bounding box, we generate a segmentation mask using the GrabCut method (although, any other method could be used), and train a convnet to regress from the image and bounding box information to the instance segmentation mask.

Note that in the original paper, a more sophisticated segmenter is used (M∩G+).

Network

SDI validates its work repurposing two different instance segmentation architectures (DeepMask [2015a] and DeepLab2 VGG-16 [2016b]). Here we use the OSVOS FCN (See section 3.1 of [2016c]).

Setup

The code in this repo was developed and tested using Anaconda3 v.4.4.0. To reproduce our conda environment, please use the following files:

On Ubuntu:

On Windows:

Jupyter Notebooks

The recommended way to test this implementation is to use the following jupyter notebooks:

  • VGG16 Net Surgery: The weakly supervised segmentation techniques presented in the "Simply Does It" paper use a backbone convnet (either DeepLab or VGG16 network) pre-trained on ImageNet. This pre-trained network takes RGB images as an input (W x H x 3). Remember that the weakly supervised version is trained using 4-channel inputs: RGB + a binary mask with a filled bounding box of the object instance. Therefore, we need to perform net surgery and create a 4-channel input version of the VGG16 net, initialized with the 3-channel parameter values except for the additional convolutional filters (we use Gaussian initialization for them).
  • "Simple Does It" Grabcut Training for Instance Segmentation: This notebook performs training of the SDI Grabcut weakly supervised model for instance segmentation. Following the instructions provided in Section "6. Instance Segmentation Results" of the "Simple Does It" paper, we use the Berkeley-augmented Pascal VOC segmentation dataset that provides per-instance segmentation masks for VOC2012 data. The Berkley augmented dataset can be downloaded from here. Again, the SDI Grabcut training is done using a 4-channel input VGG16 network pre-trained on ImageNet, so make sure to run the VGG16 Net Surgery notebook first!
  • "Simple Does It" Weakly Supervised Instance Segmentation (Testing): The sample results shown in the notebook come from running our trained model on the validation split of the Berkeley-augmented dataset.

Link to Pre-trained model and BK-VOC data files

The pre-processed BK-VOC dataset, "grabcut" segmentations, and results as well as pre-trained models (vgg_16_4chan_weak.ckpt-50000) can be found here:

If you'd rather download the Berkeley-augmented Pascal VOC segmentation dataset that provides per-instance segmentation masks for VOC2012 data from its origin, click here. Then, execute lines similar to these lines in dataset.py to generate the intermediary files used by this project:

if __name__ == '__main__':
    dataset = BKVOCDataset()
    dataset.prepare()

Make sure to set the paths at the top of dataset.py to the correct location:

if sys.platform.startswith("win"):
    _BK_VOC_DATASET = "E:/datasets/bk-voc/benchmark_RELEASE/dataset"
else:
    _BK_VOC_DATASET = '/media/EDrive/datasets/bk-voc/benchmark_RELEASE/dataset'

Training

The fully supervised version of the instance segmentation network whose performance we're trying to match is trained using the RGB images as inputs. The weakly supervised version is trained using 4-channel inputs: RGB + a binary mask with a filled bounding box of the object instance. In the latter case, the same RGB image may appear in several input samples (as many times as there are object instances associated with that RGB image).

To be clear, the output labels used for training are NOT user-provided detailed groundtruth annotations. There are no such groundtruths in the weakly supervised scenario. Instead, the labels are the segmentation masks generated using the GrabCut+ method. The weakly supoervised model is trained to regress from an image and bounding box information to a generated segmentation mask.

Testing

The sample results shown here come from running our trained model on the validation split of the Berkeley-augmented dataset (see the testing notebook). Below, we (very) subjectively categorize them as "pretty good" and "not so great".

Pretty good

Not so great

References

2016

  • [2016a] Khoreva et al. 2016. Simple Does It: Weakly Supervised Instance and Semantic Segmentation. [arXiv] [web]
  • [2016b] Chen et al. 2016. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. [arXiv]
  • [2016c] Caelles et al. 2016. OSVOS: One-Shot Video Object Segmentation. [arXiv]

2015

  • [2015a] Pinheiro et al. 2015. DeepMask: Learning to Segment Object Candidates. [arXiv]

2014

  • [2014a] Lin et al. 2014. Microsoft COCO: Common Objects in Context. [arXiv] [web]
Owner
Phil Ferriere
Former Microsoft Development Lead passionate about Deep Learning with a focus on Computer Vision.
Phil Ferriere
Certified Patch Robustness via Smoothed Vision Transformers

Certified Patch Robustness via Smoothed Vision Transformers This repository contains the code for replicating the results of our paper: Certified Patc

Madry Lab 35 Dec 14, 2022
Python scripts for performing object detection with the 1000 labels of the ImageNet dataset in ONNX.

Python scripts for performing object detection with the 1000 labels of the ImageNet dataset in ONNX. The repository combines a class agnostic object localizer to first detect the objects in the image

Ibai Gorordo 24 Nov 14, 2022
Data and code from COVID-19 machine learning paper

Machine learning approaches for localized lockdown, subnotification analysis and cases forecasting in São Paulo state counties during COVID-19 pandemi

Sara Malvar 4 Dec 22, 2022
Official Pytorch implementation of 6DRepNet: 6D Rotation representation for unconstrained head pose estimation.

6D Rotation Representation for Unconstrained Head Pose Estimation (Pytorch) Paper Thorsten Hempel and Ahmed A. Abdelrahman and Ayoub Al-Hamadi, "6D Ro

Thorsten Hempel 284 Dec 23, 2022
[NeurIPS'20] Multiscale Deep Equilibrium Models

Multiscale Deep Equilibrium Models 💥 💥 💥 💥 This repo is deprecated and we will soon stop actively maintaining it, as a more up-to-date (and simple

CMU Locus Lab 221 Dec 26, 2022
Prototype python implementation of the ome-ngff table spec

Prototype python implementation of the ome-ngff table spec

Kevin Yamauchi 8 Nov 20, 2022
A parametric soroban written with CADQuery.

A parametric soroban written in CADQuery The purpose of this project is to demonstrate how "code CAD" can be intuitive to learn. See soroban.py for a

Lee 4 Aug 13, 2022
Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian (CVPR 2022)

Pop-Out Motion Pop-Out Motion: 3D-Aware Image Deformation via Learning the Shape Laplacian (CVPR 2022) Jihyun Lee*, Minhyuk Sung*, Hyunjin Kim, Tae-Ky

Jihyun Lee 88 Nov 22, 2022
Deep Inertial Prediction (DIPr)

Deep Inertial Prediction For more information and context related to this repo, please refer to our website. Getting Started (non Docker) Note: you wi

Arcturus Industries 12 Nov 11, 2022
Official implementation for "Image Quality Assessment using Contrastive Learning"

Image Quality Assessment using Contrastive Learning Pavan C. Madhusudana, Neil Birkbeck, Yilin Wang, Balu Adsumilli and Alan C. Bovik This is the offi

Pavan Chennagiri 67 Dec 30, 2022
How to Become More Salient? Surfacing Representation Biases of the Saliency Prediction Model

How to Become More Salient? Surfacing Representation Biases of the Saliency Prediction Model

Bogdan Kulynych 49 Nov 05, 2022
Official and maintained implementation of the paper "OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data" [BMVC 2021].

OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data Christoph Reich, Tim Prangemeier, Özdemir Cetin & Heinz Koeppl | Pr

Christoph Reich 23 Sep 21, 2022
Data loaders and abstractions for text and NLP

torchtext This repository consists of: torchtext.datasets: The raw text iterators for common NLP datasets torchtext.data: Some basic NLP building bloc

3.2k Jan 08, 2023
Unofficial pytorch implementation of 'Image Inpainting for Irregular Holes Using Partial Convolutions'

pytorch-inpainting-with-partial-conv Official implementation is released by the authors. Note that this is an ongoing re-implementation and I cannot f

Naoto Inoue 525 Jan 01, 2023
Example Of Fine-Tuning BERT For Named-Entity Recognition Task And Preparing For Cloud Deployment Using Flask, React, And Docker

Example Of Fine-Tuning BERT For Named-Entity Recognition Task And Preparing For Cloud Deployment Using Flask, React, And Docker This repository contai

Nikita 12 Dec 14, 2022
Implementation of: "Exploring Randomly Wired Neural Networks for Image Recognition"

RandWireNN Unofficial PyTorch Implementation of: Exploring Randomly Wired Neural Networks for Image Recognition. Results Validation result on Imagenet

Seung-won Park 684 Nov 02, 2022
A TikTok-like recommender system for GitHub repositories based on Gorse

GitRec GitRec is the missing recommender system for GitHub repositories based on Gorse. Architecture The trending crawler crawls trending repositories

337 Jan 04, 2023
Code for Motion Representations for Articulated Animation paper

Motion Representations for Articulated Animation This repository contains the source code for the CVPR'2021 paper Motion Representations for Articulat

Snap Research 851 Jan 09, 2023
A resource for learning about deep learning techniques from regression to LSTM and Reinforcement Learning using financial data and the fitness functions of algorithmic trading

A tour through tensorflow with financial data I present several models ranging in complexity from simple regression to LSTM and policy networks. The s

195 Dec 07, 2022
Adversarial Learning for Modeling Human Motion

Adversarial Learning for Modeling Human Motion This repository contains the open source code which reproduces the results for the paper: Adversarial l

wangqi 6 Jun 15, 2021