(IEEE TIP 2021) Regularized Densely-connected Pyramid Network for Salient Instance Segmentation

Last update: Oct 21, 2022

Overview

RDPNet

PyTorch training and testing code is available. We achieve state-of-the-art (SOTA) performance on the salient instance segmentation (SIS) task.

If you run into any problems or have difficulty running this code, do not hesitate to open an issue in this repository.

My e-mail is: wuyuhuan @ mail.nankai (dot) edu.cn

[Official Ver.] [PDF]

Citations

If you are using the code/model/data provided here in a publication, please consider citing:

@article{wu2021regularized,
   title={Regularized Densely-Connected Pyramid Network for Salient Instance Segmentation},
   volume={30},
   ISSN={1941-0042},
   DOI={10.1109/tip.2021.3065822},
   journal={IEEE Transactions on Image Processing},
   publisher={Institute of Electrical and Electronics Engineers (IEEE)},
   author={Wu, Yu-Huan and Liu, Yun and Zhang, Le and Gao, Wang and Cheng, Ming-Ming},
   year={2021},
   pages={3897--3907}
}

Requirements

PyTorch 1.1/1.0.1, Torchvision 0.2.2.post3, CUDA 9.0/10.0/10.1, apex
Validated on Ubuntu 16.04/18.04, PyTorch 1.1/1.0.1, CUDA 9.0/10.0/10.1, NVIDIA TITAN Xp
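
As a quick sanity check against the versions listed above, the following is a minimal sketch that compares a version string with the tested releases. The helper name `is_tested_version` and the `TESTED_PREFIXES` table are illustrative assumptions, not part of this repository.

```python
# Version prefixes copied from the requirements line above.
TESTED_PREFIXES = {
    "torch": ("1.0.1", "1.1"),
    "torchvision": ("0.2.2",),
}


def is_tested_version(version, prefixes):
    """Return True if `version` equals a tested release or is a
    patch/post/local build of it (e.g. "1.1.0" or "0.2.2.post3")."""
    for prefix in prefixes:
        # Require a "." or "+" after the prefix so "1.10.0" does not match "1.1".
        if version == prefix or version.startswith((prefix + ".", prefix + "+")):
            return True
    return False


print(is_tested_version("1.1.0", TESTED_PREFIXES["torch"]))
print(is_tested_version("1.10.0", TESTED_PREFIXES["torch"]))
```

In an environment where PyTorch is installed, you could pass `torch.__version__` and `torchvision.__version__` to the helper instead of the literal strings.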

Installing

Please check INSTALL.md.

Note: we provide an early tested version of apex (url: here) in the root folder (./apex/). You can also try other apex versions, but we have not tested them.

Data

Before training/testing our network, please download the data: [Google Drive, 0.7G], [Baidu Yun, yhwu].

The zip file above contains the data for the ISOD and SOC datasets.

Note: if Google and Baidu services are blocked for you, contact me via e-mail and I will send you a copy of the data and model weights.

We have converted the data to JSON format, so you can use it without any preprocessing. After the download completes, extract the data and put it in the ./datasets/ folder. The ./datasets/ folder should then contain two folders: isod/ and soc/.
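
The expected layout can be checked before training with a short sketch like the one below. The function name `missing_dataset_folders` is a hypothetical helper, not something shipped in this repository; it only checks that the two folders named above exist and does not inspect the JSON files inside them.

```python
from pathlib import Path

# Folder names taken from the description above.
EXPECTED_SUBFOLDERS = ("isod", "soc")


def missing_dataset_folders(root="./datasets"):
    """Return the expected sub-folders that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_SUBFOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_folders()
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

Run it from the repository root so that the relative path ./datasets resolves correctly.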

Train

Training our network is simple. We have prepared a script that runs the training step. You can first train our ResNet-50-based network on the ISOD dataset:

cd scripts
bash ./train_isod.sh

The training step should take less than 1 hour on a single GTX 1080Ti or TITAN Xp. The script also stores the network code, config file, log, and model weights.

We also provide ResNet-101 and ResNeXt-101 training scripts; they are all in the scripts folder.

The default training code uses a single GPU because the training time is short. You can also try multi-GPU training by replacing --nproc_per_node=1 \ with --nproc_per_node=2 \ for 2-GPU training.
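
If you prefer not to edit the script by hand, the replacement described above can be sketched as a small text substitution. The helper `set_nproc_per_node` and the example launch line are assumptions for illustration; the actual contents of train_isod.sh may differ.

```python
import re


def set_nproc_per_node(script_text, num_gpus):
    """Return `script_text` with every --nproc_per_node=<int> set to `num_gpus`."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return re.sub(r"--nproc_per_node=\d+", f"--nproc_per_node={num_gpus}", script_text)


# Hypothetical launch line; the real training script may look different.
example = "python -m torch.distributed.launch \\\n    --nproc_per_node=1 \\\n    train.py\n"
print(set_nproc_per_node(example, 2))
```

To apply it to a real script, read the file, pass its text through the helper, and write the result to a copy so the original single-GPU script stays intact.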

Test / Evaluation / Results

Testing our network is also simple. First, download the model weights:

ResNet-50 (ISOD dataset): [Google Drive, 0.14G], [Baidu Yun, yhwu]
ResNet-50 (SOC dataset): [Google Drive, 0.14G], [Baidu Yun, yhwu]

Taking the test on the ISOD dataset as an example:

Download the ISOD-trained model weights and put them in the model_zoo/ folder.
cd to the scripts folder, then run bash test_isod.sh.
The testing step usually takes less than a minute. We use the official cocoapi for evaluation.

Note1: We strongly recommend using cocoapi to evaluate the performance. This evaluation is also done automatically during the testing process.

Note2: The default cocoapi evaluation outputs AP, AP50, and AP75. To output AP70, you need to change cocoeval.py in cocoapi. See the changes in this commit:

BEFORE: stats[2] = _summarize(1, iouThr=.75, maxDets=self.params.maxDets[2])
AFTER:  stats[2] = _summarize(1, iouThr=.70, maxDets=self.params.maxDets[2])

Note3: If you are not familiar with the evaluation metrics AP, AP50, and AP75, you can refer to the introduction website here. Our paper also introduces them in the Experiments section.
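
To make the role of the IoU threshold in AP50, AP70, and AP75 concrete, here is a minimal sketch of how a single predicted mask is judged against a ground-truth mask at different thresholds. It is a simplified illustration and not the cocoapi implementation, which additionally handles score ranking, multiple instances per image, and precision-recall averaging. The function names `mask_iou` and `is_match` are assumptions.

```python
def mask_iou(pred, gt):
    """Intersection over union of two binary masks given as nested lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    return inter / union if union else 0.0


def is_match(pred, gt, iou_thr):
    """A prediction counts as a true positive when its IoU reaches the threshold."""
    return mask_iou(pred, gt) >= iou_thr


# Toy 3x4 masks: 9 ground-truth pixels, 10 predicted pixels, 8 shared pixels.
gt = [[1, 1, 1, 1],
      [1, 1, 1, 1],
      [1, 0, 0, 0]]
pred = [[0, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 0]]

print(round(mask_iou(pred, gt), 3))  # 8 / 11
for thr in (0.50, 0.70, 0.75):
    print(thr, is_match(pred, gt, thr))
```

In this toy case the IoU is 8/11, so the prediction is a match at thresholds 0.50 and 0.70 but not at 0.75. This is why a score at the 0.70 threshold can differ from the score at 0.75 for the same predictions.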

Visualize

We provide a simple Python script to visualize the results: demo/visualize.py.

Make sure that you have downloaded the ISOD pretrained weights [Google Drive, 0.14G].
Put images in the demo/examples/ folder. I have prepared some images from the paper, so you do not need to supply your own.
cd demo, then run python visualize.py.
Visualized images are generated in the same folder. You can change the target folder in visualize.py.

TODO

Release the weights for real-world applications
Add Jittor implementation
Train with the enhanced base detector (FCOS TPAMI version) for better performance. Currently the base detector is the FCOS conference version, which has slightly lower performance.

Other Tips

I am happy to answer your questions if you are interested in salient instance segmentation. I also encourage everyone to contact me via e-mail: wuyuhuan @ mail.nankai (dot) edu.cn

Acknowledgement

This repository is built with the help of the following three projects, for academic use only:

Owner

Yu-Huan Wu
