Leveraging Instance-, Image- and Dataset-Level Information for Weakly Supervised Instance Segmentation


This paper has been accepted by IEEE TPAMI 2020 and is available via early access.

Code contact e-mail: Yu-Huan Wu (wuyuhuan (at) mail(dot)nankai(dot)edu(dot)cn)

Introduction

Weakly supervised semantic instance segmentation with only image-level supervision, instead of relying on expensive pixel-wise masks or bounding-box annotations, is an important problem for alleviating the data-hungry nature of deep learning. In this paper, we tackle this challenging problem by aggregating the image-level information of all training images into a large knowledge graph and exploiting semantic relationships from this graph. Specifically, our effort starts with some generic segment-based object proposals (SOP) without category priors. We propose a multiple instance learning (MIL) framework that can be trained end-to-end using training images with only image-level labels. For each proposal, this MIL framework simultaneously computes a probability distribution and category-aware semantic features, with which we formulate a large undirected graph. A background category is also included in this graph to filter out the massive number of noisy object proposals. An optimal multi-way cut of this graph then assigns a reliable category label to each proposal. The denoised SOP with assigned category labels can be viewed as pseudo instance segmentation labels of the training images, which are then used to train fully supervised models. The proposed approach achieves state-of-the-art performance for both weakly supervised instance segmentation and semantic segmentation.
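
To make the MIL idea above more concrete, below is a minimal PyTorch sketch: per-proposal category scores (including a background category) are aggregated into an image-level prediction so the network can be trained with image-level labels only. The layer sizes, max-pooling aggregation, and loss are illustrative assumptions, not the paper's exact architecture.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ProposalMILHead(nn.Module):
        """Illustrative MIL head: scores each proposal and pools to the image level."""
        def __init__(self, feat_dim=2048, num_classes=20):
            super().__init__()
            self.cls = nn.Linear(feat_dim, num_classes + 1)  # +1 for background

        def forward(self, proposal_feats):
            # proposal_feats: (num_proposals, feat_dim) for a single image
            logits = self.cls(proposal_feats)
            probs = F.softmax(logits, dim=1)           # per-proposal distributions
            # Aggregate proposal scores into an image-level prediction; max pooling
            # is one common MIL choice (the paper may use a different aggregation).
            image_probs, _ = probs[:, 1:].max(dim=0)   # drop the background column
            return probs, image_probs

    # Training uses image-level labels only (dummy data shown here):
    head = ProposalMILHead()
    feats = torch.randn(300, 2048)                      # features of 300 proposals
    image_labels = torch.zeros(20)
    image_labels[[4, 11]] = 1                           # e.g. two classes present
    _, image_probs = head(feats)
    loss = F.binary_cross_entropy(image_probs, image_labels)
    loss.backward()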

Citations

If you are using the code/model/data provided here in a publication, please consider citing:

@article{liu2020leveraging,
  title={Leveraging Instance-, Image- and Dataset-Level Information for Weakly Supervised Instance Segmentation},
  author={Yun Liu and Yu-Huan Wu and Peisong Wen and Yujun Shi and Yu Qiu and Ming-Ming Cheng},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2020},
  doi={10.1109/TPAMI.2020.3023152},
  publisher={IEEE}
}

Requirements

  • Python 3.5, PyTorch 0.4.1, Torchvision 0.2.2.post3, CUDA 9.0
  • Validated on Ubuntu 16.04, NVIDIA TITAN Xp

Testing LIID

  1. Clone the LIID repository

    git clone https://github.com/yun-liu/LIID.git
    
  2. Download the pretrained model of the MIL framework, and put it into the $ROOT_DIR folder.

  3. Download the Pascal VOC2012 dataset. Extract the dataset files into the $VOC2012_ROOT folder.

  4. Download the segment-based object proposals, and extract the data into the $VOC2012_ROOT/proposals/ folder.

  5. Download the compiled binary files, and put the binary files into $ROOT_DIR/cut/multiway_cut/.

  6. Change the path in cut/run.sh to your own project root.

  7. Run ./make.sh to build the CUDA dependencies.

  8. Run python3 gen_proposals.py. Remember to change the voc-root path to your own $VOC2012_ROOT. The proposals with labels will be generated in the $ROOT_DIR/proposals folder. (A quick sanity check of the paths assumed above is sketched after this list.)
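
Before step 8, it can help to verify the directory layout that the steps above assume. The snippet below is not part of the repository; the two root paths are placeholders you need to adapt.

    import os

    ROOT_DIR = os.path.expanduser("~/LIID")      # placeholder: where you cloned the repo
    VOC2012_ROOT = "/path/to/VOC2012"            # placeholder: your Pascal VOC2012 root

    expected = [
        os.path.join(VOC2012_ROOT, "JPEGImages"),         # VOC images
        os.path.join(VOC2012_ROOT, "proposals"),          # step 4
        os.path.join(ROOT_DIR, "cut", "multiway_cut"),    # step 5
    ]
    for path in expected:
        print(("ok       " if os.path.isdir(path) else "MISSING  ") + path)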

Pretrained Models and Data

The pretrained model of the MIL framework can be downloaded here.

The Pascal VOC2012 dataset can be downloaded here or from other mirror sites.

S4Net proposals used for testing can be downloaded here.

The 24K simple ImageNet data (including S4Net proposals) can be downloaded here.

MCG proposals can be downloaded here.

Training with Pseudo Labels

For instance segmentation, you can use official or other popular open-source Mask R-CNN implementations such as mmdetection, Detectron2, or maskrcnn-benchmark.
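
If you go this route, the generated pseudo instance masks usually need to be packed into a COCO-style annotation file first. The sketch below uses pycocotools for the mask encoding; the in-memory input format (per-instance binary masks plus class ids) and the category list are assumptions, since it does not parse this repository's actual output files.

    import json
    import numpy as np
    from pycocotools import mask as mask_utils

    def masks_to_coco(images, instances, categories, out_path):
        """images: [{"id", "file_name", "height", "width"}, ...];
        instances: [{"image_id", "category_id", "mask"}, ...] with "mask" an
        HxW binary numpy array; categories: COCO-style category dicts."""
        annotations = []
        for ann_id, inst in enumerate(instances, start=1):
            rle = mask_utils.encode(np.asfortranarray(inst["mask"].astype(np.uint8)))
            rle["counts"] = rle["counts"].decode("ascii")   # make RLE JSON-serializable
            annotations.append({
                "id": ann_id,
                "image_id": inst["image_id"],
                "category_id": inst["category_id"],
                "segmentation": rle,
                "area": float(mask_utils.area(rle)),
                "bbox": [float(v) for v in mask_utils.toBbox(rle)],
                "iscrowd": 0,
            })
        with open(out_path, "w") as f:
            json.dump({"images": images, "annotations": annotations,
                       "categories": categories}, f)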

For semantic segmentation, you can use the official Caffe implementation of DeepLab, the third-party PyTorch implementation here, or the third-party TensorFlow implementation here.
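
For DeepLab-style training, the instance-level pseudo labels additionally need to be flattened into per-pixel class maps in the usual Pascal VOC format (0 for background, 1-20 for object classes). A minimal sketch, again assuming in-memory masks and class ids rather than this repository's actual file format:

    import numpy as np
    from PIL import Image

    def instances_to_semantic(masks, class_ids, height, width):
        """masks: list of HxW binary numpy arrays; class_ids: matching VOC class
        indices in 1..20. Later instances overwrite earlier ones where they overlap."""
        label_map = np.zeros((height, width), dtype=np.uint8)
        for mask, cls in zip(masks, class_ids):
            label_map[mask.astype(bool)] = cls
        return label_map

    # Save as a single-channel PNG so standard VOC/DeepLab data loaders can read it:
    # Image.fromarray(instances_to_semantic(masks, class_ids, h, w)).save("2007_000033.png")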

Precomputed Results

Results of instance segmentation on the Pascal VOC2012 segmentation val split can be downloaded here.

Results of semantic segmentation models trained with 10K images, with 10K images + 24K simple ImageNet images, and with 10K images (Res2Net-101), all evaluated on the Pascal VOC2012 segmentation val split, can be downloaded here.

Other Notes

Since it is difficult to install and configure IBM CPLEX, for convenience we provide a compiled binary that can be run directly. If you would like the complete source code for solving the multi-way cut and can ensure that it will not be used commercially, please contact Yu-Huan Wu (wuyuhuan (at) mail(dot)nankai(dot)edu(dot)cn).
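
For readers who only want to see the shape of the optimization without CPLEX, below is a generic node-labeling ILP in the spirit of a multi-way cut, written with the open-source PuLP/CBC stack. It is a sketch of the general technique, not the formulation or code inside the provided binary; the unary costs (e.g. derived from the MIL probabilities) and the pairwise similarity weights are assumed to be given.

    import pulp

    def solve_labeling(nodes, labels, unary, edges):
        """nodes/labels: hashable ids; unary[v][k]: cost of assigning label k to
        node v; edges: {(u, v): weight} penalizing u and v taking different labels."""
        prob = pulp.LpProblem("multiway_cut_sketch", pulp.LpMinimize)
        x = pulp.LpVariable.dicts("x", (nodes, labels), cat="Binary")
        edge_list = list(edges)
        z = pulp.LpVariable.dicts("z", (range(len(edge_list)), labels), lowBound=0)

        for v in nodes:                                  # each node takes exactly one label
            prob += pulp.lpSum(x[v][k] for k in labels) == 1

        for i, (u, v) in enumerate(edge_list):           # z[i][k] >= |x[u][k] - x[v][k]|
            for k in labels:
                prob += z[i][k] >= x[u][k] - x[v][k]
                prob += z[i][k] >= x[v][k] - x[u][k]

        # At the optimum, 0.5 * sum_k z[i][k] is 1 exactly when the two endpoints disagree.
        prob += (pulp.lpSum(unary[v][k] * x[v][k] for v in nodes for k in labels)
                 + pulp.lpSum(0.5 * edges[edge_list[i]] * z[i][k]
                              for i in range(len(edge_list)) for k in labels))

        prob.solve(pulp.PULP_CBC_CMD(msg=False))
        return {v: max(labels, key=lambda k: x[v][k].value()) for v in nodes}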

Acknowledgment

This code is based on IBM CPLEX. Thanks to IBM for providing the academic version of CPLEX.
