A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains (IJCV submission)

Overview

wsss-analysis

Code for the paper "A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains" (arXiv pre-print, 2019).

Introduction

We conduct the first comprehensive analysis of Weakly-Supervised Semantic Segmentation (WSSS) with image label supervision in different image domains. WSSS has been evaluated almost exclusively on PASCAL VOC2012, and little work has been done on applying it to other image domains, such as histopathology and satellite images. The paper analyzes the compatibility of different methods with representative datasets and presents principles for applying WSSS to an unseen dataset.

In this repository, we provide the evaluation code used to generate the weak localization cues and final segmentations from Section 5 (Performance Evaluation) of the paper, which enables reproducing the results reported there. The Keras implementation of HistoSegNet was adapted from hsn_v1; the TensorFlow implementations of SEC and DSRG were adapted from SEC-tensorflow and DSRG-tensorflow, respectively; the PyTorch implementation of IRNet was adapted from irn. Pretrained models and evaluation images are also available for download.

Citing this repository

If you find this code useful in your research, please consider citing us:

    @article{chan2019comprehensive,
        title={A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains},
        author={Chan, Lyndon and Hosseini, Mahdi S. and Plataniotis, Konstantinos N.},
        journal={International Journal of Computer Vision},
        volume={},
        number={},
        pages={},
        year={2020},
        publisher={Springer}
    }

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

Mandatory

  • python (checked on 3.5)
  • scipy (checked on 1.2.0)
  • skimage / scikit-image (checked on 0.15.0)
  • keras (checked on 2.2.4)
  • tensorflow (checked on 1.13.1)
  • tensorflow-gpu (checked on 1.13.1)
  • numpy (checked on 1.18.1)
  • pandas (checked on 0.23.4)
  • cv2 / opencv-python (checked on 3.4.4.19)
  • cython
  • imageio (checked on 2.5.0)
  • chainercv (checked on 0.12.0)
  • pydensecrf (git+https://github.com/lucasb-eyer/pydensecrf.git)
  • torch (checked on 1.1.0)
  • torchvision (checked on 0.2.2.post3)
  • tqdm

Optional

  • matplotlib (checked on 3.0.2)
  • jupyter
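
As a quick sanity check, the sketch below compares installed package versions against the versions listed above. It is only an illustration: the helper names `find_mismatches` and `installed_versions` are hypothetical and not part of this repository, and the distribution names (for example `scikit-image` for skimage and `opencv-python` for cv2) are assumed to match what pip reports. Packages listed without a version are left out.

```python
# Versions this README reports as tested; packages listed without a version are omitted.
TESTED_VERSIONS = {
    "scipy": "1.2.0",
    "scikit-image": "0.15.0",
    "keras": "2.2.4",
    "tensorflow": "1.13.1",
    "tensorflow-gpu": "1.13.1",
    "numpy": "1.18.1",
    "pandas": "0.23.4",
    "opencv-python": "3.4.4.19",
    "imageio": "2.5.0",
    "chainercv": "0.12.0",
    "torch": "1.1.0",
    "torchvision": "0.2.2.post3",
}


def find_mismatches(installed, tested=TESTED_VERSIONS):
    """Return {package: (installed_or_None, tested)} for every package whose
    installed version differs from the tested one."""
    mismatches = {}
    for name, wanted in tested.items():
        found = installed.get(name)
        if found != wanted:
            mismatches[name] = (found, wanted)
    return mismatches


def installed_versions(names):
    """Look up installed distribution versions; missing packages map to None."""
    try:
        from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
    except ImportError:
        # Older interpreters (such as the Python 3.5 listed above) fall back to pkg_resources.
        import pkg_resources

        def version(name):
            return pkg_resources.get_distribution(name).version

        PackageNotFoundError = pkg_resources.DistributionNotFound
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    report = find_mismatches(installed_versions(TESTED_VERSIONS))
    for name, (found, wanted) in sorted(report.items()):
        print("{}: installed={} tested={}".format(name, found, wanted))
```

A mismatch does not necessarily mean the code will fail; it only flags where your environment departs from the tested one.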

To utilize the code efficiently, GPU support is required. The following configurations have been tested to work successfully:

  • CUDA Version: 10
  • CUDA Driver Version: r440
  • cuDNN Version: 7.6.4 - 7.6.5

We do not guarantee proper functioning of the code using different versions of CUDA or cuDNN.

Hardware Requirements

Each method used in this repository has different GPU memory requirements. We have listed the approximate GPU memory requirements for each model through our own experiments:

  • 01_train: ~6 GB (e.g. NVIDIA RTX 2060)
  • 02_cues: ~6 GB (e.g. NVIDIA RTX 2060)
  • 03a_sec-dsrg: ~11 GB (e.g. NVIDIA RTX 2080 Ti)
  • 03b_irn: ~8 GB (e.g. NVIDIA GTX 1070)
  • 03c_hsn: ~6 GB (e.g. NVIDIA RTX 2060)
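
The figures above can be turned into a small lookup that tells you which stages should fit on a given card. This is only a sketch using the approximate numbers from the list; the helper name `stages_that_fit` is hypothetical.

```python
# Approximate GPU memory needs (in GB) as listed above.
APPROX_GPU_MEMORY_GB = {
    "01_train": 6,
    "02_cues": 6,
    "03a_sec-dsrg": 11,
    "03b_irn": 8,
    "03c_hsn": 6,
}


def stages_that_fit(available_gb, requirements=APPROX_GPU_MEMORY_GB):
    """Return the stages whose approximate requirement is within available_gb."""
    return sorted(
        stage for stage, needed in requirements.items() if needed <= available_gb
    )


print(stages_that_fit(8))
```

For an 8 GB card this lists every stage except 03a_sec-dsrg. Because the requirements are approximate, treat a borderline result with caution.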

Downloading data

The pretrained models, ground-truth annotations, and images used in this paper are available on Zenodo under a Creative Commons Attribution license: DOI. Please extract the contents into your wsss-analysis\database directory. If you choose to extract the data to another directory, please modify the filepaths accordingly in settings.ini.
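
If you extract the data elsewhere, the paths in settings.ini can be edited by hand or rewritten with a short script. The sketch below assumes settings.ini is a standard INI file with section headers; the section name `paths`, the keys, and the helper `retarget_paths` are all hypothetical and only illustrate the idea.

```python
import configparser
import io


def retarget_paths(ini_text, old_root, new_root):
    """Rewrite every value that starts with old_root so it starts with new_root."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(ini_text)
    for section in parser.sections():
        for key, value in parser.items(section):
            if value.startswith(old_root):
                parser.set(section, key, new_root + value[len(old_root):])
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Hypothetical file contents, for illustration only.
sample = "[paths]\ndata_dir = database/ADPdevkit\nlog_dir = logs\n"
print(retarget_paths(sample, "database", "/mnt/data/wsss"))
```

Check the actual section and key names in your copy of settings.ini before applying anything like this to it.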

Note: the training-set images of ADP are released on a case-by-case basis due to the confidentiality agreement for releasing the data. To obtain access to wsss-analysis\database\ADPdevkit\ADPRelease1\JPEGImages and wsss-analysis\database\ADPdevkit\ADPRelease1\PNGImages needed for gen_cues in 01_weak_cues, apply for access separately here.

Running the code

Scripts

To run 02_cues (generate weak cues for SEC and DSRG):

    cd 02_cues
    python demo.py

To run 03a_sec-dsrg (train/evaluate SEC, DSRG performance in Section 5; to omit training, comment out lines 76-77 in 03a_sec-dsrg\demo.py):

    cd 03a_sec-dsrg
    python demo.py

To run 03b_irn (train/evaluate IRNet and Grad-CAM performance in Section 5):

    cd 03b_irn
    python demo_tune.py

To run 03b_irn (evaluate pre-trained Grad-CAM performance in Section 5):

    cd 03b_irn
    python demo_cam.py

To run 03b_irn (evaluate pre-trained IRNet performance in Section 5):

    cd 03b_irn
    python demo_sem_seg.py

To run 03c_hsn (evaluate HistoSegNet performance in Section 5):

    cd 03c_hsn
    python demo.py
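
The commands above can also be chained from one Python driver. The sketch below is not part of the repository: `build_commands` and `run_all` are hypothetical helpers, and the stage order simply follows the order of the instructions above, with the training variant `demo_tune.py` chosen for 03b_irn.

```python
import os
import subprocess
import sys

# (directory, script) pairs taken from the instructions above.
STAGES = [
    ("02_cues", "demo.py"),
    ("03a_sec-dsrg", "demo.py"),
    ("03b_irn", "demo_tune.py"),
    ("03c_hsn", "demo.py"),
]


def build_commands(repo_root, stages=STAGES, python=sys.executable):
    """Return (working_directory, argv) pairs, one per stage."""
    return [
        (os.path.join(repo_root, directory), [python, script])
        for directory, script in stages
    ]


def run_all(repo_root):
    """Run every stage in order, stopping at the first failure."""
    for workdir, command in build_commands(repo_root):
        subprocess.run(command, cwd=workdir, check=True)
```

Calling `run_all(".")` from the repository root would launch each stage in turn; swap `demo_tune.py` for `demo_cam.py` or `demo_sem_seg.py` to evaluate the pre-trained models instead.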

Notebooks

03a_sec-dsrg:

03b_irn:

  • VGG16-IRNet on ADP-morph: (TODO)
  • VGG16-IRNet on ADP-func: (TODO)
  • VGG16-IRNet on VOC2012: (TODO)
  • VGG16-IRNet on DeepGlobe: (TODO)

03c_hsn:

Results

To access each method's evaluation results, check the associated eval (for numerical results) and out (for outputted images) folders. For easy access to all evaluated results, run scripts/extract_eval.py.

(NOTE: the numerical results obtained for SEC and DSRG DeepGlobe_balanced differ slightly from those reported in the paper due to retraining the models during code cleanup. Also, tuning is equivalent to the validation set and segtest is equivalent to the evaluation set in ADP. See hsn_v1 to replicate those results for ADP precisely.)

| Dataset | Training | Testing | VGG16 Grad-CAM | VGG16 SEC | VGG16 DSRG | VGG16 IRNet | VGG16 HistoSegNet | X1.7/M7 Grad-CAM | X1.7/M7 SEC | X1.7/M7 DSRG | X1.7/M7 IRNet | X1.7/M7 HistoSegNet |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ADP-morph | train | validation | 0.14507 | 0.10730 | 0.08826 | 0.15068 | 0.13255 | 0.20997 | 0.13597 | 0.13458 | 0.21450 | 0.27546 |
| ADP-morph | train | evaluation | 0.14946 | 0.11409 | 0.08011 | 0.15546 | 0.16159 | 0.21426 | 0.13369 | 0.10835 | 0.21737 | 0.26156 |
| ADP-func | train | validation | 0.34813 | 0.28232 | 0.37193 | 0.35016 | 0.44215 | 0.35233 | 0.32216 | 0.28625 | 0.34730 | 0.50663 |
| ADP-func | train | evaluation | 0.38187 | 0.28097 | 0.44726 | 0.36318 | 0.44115 | 0.37910 | 0.30828 | 0.31734 | 0.38943 | 0.48020 |
| VOC2012 | train | val | 0.26262 | 0.37058 | 0.32129 | 0.31198 | 0.22707 | 0.14946 | 0.37629 | 0.35004 | 0.17844 | 0.09201 |
| DeepGlobe | training (75% test) | evaluation (25% test) | 0.28037 | 0.24005 | 0.28841 | 0.29405 | 0.24019 | 0.21260 | 0.24841 | 0.35258 | 0.24620 | 0.29398 |
| DeepGlobe | training (37.5% test) | evaluation (25% test) | 0.28083 | 0.25512 | 0.32017 | 0.29207 | 0.30410 | 0.22266 | 0.20050 | 0.26470 | 0.21303 | 0.21617 |
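
To read the table at a glance, the sketch below picks the best-scoring network and method for each row. It assumes that higher scores are better (the table does not name the metric) and copies the values from the table above; the helper name `best_per_row` is hypothetical.

```python
NETWORKS = ["VGG16", "X1.7/M7"]
METHODS = ["Grad-CAM", "SEC", "DSRG", "IRNet", "HistoSegNet"]
# Column order of the table: all VGG16 methods first, then all X1.7/M7 methods.
COLUMNS = [(network, method) for network in NETWORKS for method in METHODS]

# (dataset, training split, testing split) -> scores in column order.
ROWS = {
    ("ADP-morph", "train", "validation"):
        [0.14507, 0.10730, 0.08826, 0.15068, 0.13255, 0.20997, 0.13597, 0.13458, 0.21450, 0.27546],
    ("ADP-morph", "train", "evaluation"):
        [0.14946, 0.11409, 0.08011, 0.15546, 0.16159, 0.21426, 0.13369, 0.10835, 0.21737, 0.26156],
    ("ADP-func", "train", "validation"):
        [0.34813, 0.28232, 0.37193, 0.35016, 0.44215, 0.35233, 0.32216, 0.28625, 0.34730, 0.50663],
    ("ADP-func", "train", "evaluation"):
        [0.38187, 0.28097, 0.44726, 0.36318, 0.44115, 0.37910, 0.30828, 0.31734, 0.38943, 0.48020],
    ("VOC2012", "train", "val"):
        [0.26262, 0.37058, 0.32129, 0.31198, 0.22707, 0.14946, 0.37629, 0.35004, 0.17844, 0.09201],
    ("DeepGlobe", "training (75% test)", "evaluation (25% test)"):
        [0.28037, 0.24005, 0.28841, 0.29405, 0.24019, 0.21260, 0.24841, 0.35258, 0.24620, 0.29398],
    ("DeepGlobe", "training (37.5% test)", "evaluation (25% test)"):
        [0.28083, 0.25512, 0.32017, 0.29207, 0.30410, 0.22266, 0.20050, 0.26470, 0.21303, 0.21617],
}


def best_per_row(rows=ROWS, columns=COLUMNS):
    """Map each row to ((network, method), score) for its highest score."""
    best = {}
    for key, scores in rows.items():
        index = max(range(len(scores)), key=scores.__getitem__)
        best[key] = (columns[index], scores[index])
    return best


for key, (column, score) in best_per_row().items():
    print(key, "->", column, score)
```

Under that assumption, HistoSegNet on X1.7/M7 leads on all four ADP rows, while SEC and DSRG lead on VOC2012 and DeepGlobe.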

Examples

ADP-morph

ADP-func

VOC2012

DeepGlobe

TODO

  1. Improve comments and code documentation
  2. Add IRNet notebooks
  3. Clean up IRNet code
Comments
  • Incorrect Axis?

    I think the axis=2 is wrong in this line. The docstring says the shape should be BxHxWxC, which would make axis=2 take the argmax over the width dimension, but I think you mean to take it over the class dimension. But seeing as how your code worked using axis=2 I assume it is not a mistake in the code but rather the docstring is incorrect. I guess the inputs to the function are using HxWxC dimensions.

    opened by hasoweh 1
  • Background class DeepGlobe

    Hi, I have a quick question. Are you using a background class in your 'cues' for the DeepGlobe dataset? If so, is this class representing areas in the CAM that are below the FG threshold (20%)?

    Thanks!

    opened by hasoweh 0
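
The axis question raised in the first comment can be illustrated with NumPy. The sketch below does not reproduce the repository's function; it only shows, on random arrays with assumed shapes, how `axis=2` behaves for HxWxC versus BxHxWxC inputs.

```python
import numpy as np

rng = np.random.RandomState(0)
single = rng.rand(4, 5, 3)     # H x W x C: one image, 3 classes
batch = rng.rand(2, 4, 5, 3)   # B x H x W x C: two images

# For an H x W x C input, axis=2 is the class axis, so the result is H x W.
labels_single = np.argmax(single, axis=2)
print(labels_single.shape)

# For a B x H x W x C input, axis=2 is the width axis, which is rarely intended.
over_width = np.argmax(batch, axis=2)
print(over_width.shape)

# axis=-1 always selects the last (class) axis, whatever the rank.
labels_batch = np.argmax(batch, axis=-1)
print(labels_batch.shape)
```

Using `axis=-1` avoids the ambiguity, since it gives the per-pixel class index for both layouts.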
Releases(v2.0)
  • v2.0(Jun 21, 2020)

    Code repository corresponding to the second version of the arXiv pre-print: [v2] Tue, 12 May 2020 04:42:47 UTC (6,209 KB). Please note that four methods are evaluated in this version (SEC, DSRG, IRNet, HistoSegNet) with Grad-CAM providing the baseline. Performance is inferior to that reported in the first version of the pre-print.

  • v1.1(Jun 21, 2020)

    Code repository corresponding to the first version of the arXiv pre-print: [v1] Tue, 24 Dec 2019 03:00:34 UTC (8,560 KB). Please note that three methods are evaluated in this version (SEC, DSRG, and HistoSegNet) with the baseline being the thresholded weak cues from Grad-CAM. Performance is inferior to that reported in subsequent versions of the pre-print.

Owner
Lyndon Chan
Computer Vision, Natural Language Processing, Machine Learning | Data Scientist at Alphabyte Solutions (ECE MASc'20, University of Toronto)