(IEEE TIP 2021) Regularized Densely-connected Pyramid Network for Salient Instance Segmentation

Overview

RDPNet

IEEE TIP 2021: Regularized Densely-connected Pyramid Network for Salient Instance Segmentation

PyTorch training and testing code are available. We have achieved SOTA performance on the salient instance segmentation (SIS) task.

If you run into any problems or feel any difficulties to run this code, do not hesitate to leave issues in this repository.

My e-mail is: wuyuhuan @ mail.nankai (dot) edu.cn

[Official Ver.] [PDF]

Citations

If you are using the code/model/data provided here in a publication, please consider citing:

@article{wu2021regularized,
   title={Regularized Densely-Connected Pyramid Network for Salient Instance Segmentation},
   volume={30},
   ISSN={1941-0042},
   DOI={10.1109/tip.2021.3065822},
   journal={IEEE Transactions on Image Processing},
   publisher={Institute of Electrical and Electronics Engineers (IEEE)},
   author={Wu, Yu-Huan and Liu, Yun and Zhang, Le and Gao, Wang and Cheng, Ming-Ming},
   year={2021},
   pages={3897–3907}
}

Requirements

  • PyTorch 1.1/1.0.1, Torchvision 0.2.2.post3, CUDA 9.0/10.0/10.1, apex
  • Validated on Ubuntu 16.04/18.04, PyTorch 1.1/1.0.1, CUDA 9.0/10.0/10.1, NVIDIA TITAN Xp

Installing

Please check INSTALL.md.

Note: we have provided an early tested apex version (url: here) and place it in our root folder (./apex/). You can also try other apex versions, which are not tested by us.

Data

Before training/testing our network, please download the data: [Google Drive, 0.7G], [Baidu Yun, yhwu].

The above zip file contains data of the ISOD and SOC dataset.

Note: if you are blocked by Google and Baidu services, you can contact me via e-mail and I will send you a copy of data and model weights.

We have processed the data to json format so you can use them without any preprocessing steps. After completion of downloading, extract the data and put them to ./datasets/ folder. Then, the ./datasets/ folder should contain two folders: isod/, soc/.

Train

It is very simple to train our network. We have prepared a script to run the training step. You can at first train our ResNet-50-based network on the ISOD dataset:

cd scripts
bash ./train_isod.sh

The training step should cost less than 1 hour for single GTX 1080Ti or TITAN Xp. This script will also store the network code, config file, log, and model weights.

We also provide ResNet-101 and ResNeXt-101 training scripts, and they are all in the scripts folder.

The default training code is for single gpu training since the training time is very low. You can also try multi gpus training by replacing --nproc_per_node=1 \ with --nproc_per_node=2 \ for 2-gpu training.

Test / Evaluation / Results

It is also very simple to test our network. First you need to download the model weights:

Taking the test on the ISOD dataset for example:

  1. Download the ISOD trained model weights, put it to model_zoo/ folder.
  2. cd the scripts folder, then run bash test_isod.sh.
  3. Testing step usually costs less than a minute. We use the official cocoapi for evaluation.

Note1: We strongly recommend to use cocoapi to evaluate the performance. Such evaluation is also automatically done with the testing process.

Note2: Default cocoapi evaluation outputs AP, AP50, AP75 peformance. To output the score of AP70, you need to change the cocoeval.py in cocoapi. See changes in this commitment:

BEFORE: stats[2] = _summarize(1, iouThr=.75, maxDets=self.params.maxDets[2])
AFTER:  stats[2] = _summarize(1, iouThr=.70, maxDets=self.params.maxDets[2])

Note3: If you are not familiar with the evalutation metric AP, AP50, AP75, you can refer to the introduction website here. Our official paper also introduces them in the Experiments section.

Visualize

We provide a simple python script to visualize the result: demo/visualize.py.

  1. Be sure that you have downloaded the ISOD pretrained weights [Google Drive, 0.14G].
  2. Put images to the demo/examples/ folder. I have prepared some images in this paper so do not worry that you have no images.
  3. cd demo, run python visualize.py
  4. Visualized images are generated in the same folder. You can change the target folder in visualize.py.

TODO

  1. Release the weights for real-world applications
  2. Add Jittor implementation
  3. Train with the enhanced base detector (FCOS TPAMI version) for better performance. Currently the base detector is the FCOS conference version with a bit lower performance.

Other Tips

I am free to answer your question if you are interested in salient instance segmentation. I also encourage everyone to contact me via my e-mail. My e-mail is: wuyuhuan @ mail.nankai (dot) edu.cn

Acknowlogdement

This repository is built under the help of the following three projects for academic use only:

Owner
Yu-Huan Wu
Ph.D. student at Nankai University
Yu-Huan Wu
[ICLR 2022] Contact Points Discovery for Soft-Body Manipulations with Differentiable Physics

CPDeform Code and data for paper Contact Points Discovery for Soft-Body Manipulations with Differentiable Physics at ICLR 2022 (Spotlight). @InProceed

(Lester) Sizhe Li 29 Nov 29, 2022
TLDR; Train custom adaptive filter optimizers without hand tuning or extra labels.

AutoDSP TLDR; Train custom adaptive filter optimizers without hand tuning or extra labels. About Adaptive filtering algorithms are commonplace in sign

Jonah Casebeer 48 Sep 19, 2022
Official Implement of CVPR 2021 paper “Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting”

RGBT Crowd Counting Lingbo Liu, Jiaqi Chen, Hefeng Wu, Guanbin Li, Chenglong Li, Liang Lin. "Cross-Modal Collaborative Representation Learning and a L

37 Dec 08, 2022
Lacmus is a cross-platform application that helps to find people who are lost in the forest using computer vision and neural networks.

lacmus The program for searching through photos from the air of lost people in the forest using Retina Net neural nwtwork. The project is being develo

Lacmus Foundation 168 Dec 27, 2022
OREO: Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning (NeurIPS 2021)

OREO: Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning (NeurIPS 2021) Video demo We here provide a video demo from co

20 Nov 25, 2022
🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022

🦙 LaMa Image Inpainting, Resolution-robust Large Mask Inpainting with Fourier Convolutions, WACV 2022

Advanced Image Manipulation Lab @ Samsung AI Center Moscow 4.7k Dec 31, 2022
Chunkmogrify: Real image inversion via Segments

Chunkmogrify: Real image inversion via Segments Teaser video with live editing sessions can be found here This code demonstrates the ideas discussed i

David Futschik 112 Jan 04, 2023
A Next Generation ConvNet by FaceBookResearch Implementation in PyTorch(Original) and TensorFlow.

ConvNeXt A Next Generation ConvNet by FaceBookResearch Implementation in PyTorch(Original) and TensorFlow. A FacebookResearch Implementation on A Conv

Raghvender 2 Feb 14, 2022
Official PyTorch implementation of Less is More: Pay Less Attention in Vision Transformers.

Less is More: Pay Less Attention in Vision Transformers Official PyTorch implementation of Less is More: Pay Less Attention in Vision Transformers. By

73 Jan 01, 2023
[SIGGRAPH 2022 Journal Track] AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars

AvatarCLIP: Zero-Shot Text-Driven Generation and Animation of 3D Avatars Fangzhou Hong1*  Mingyuan Zhang1*  Liang Pan1  Zhongang Cai1,2,3  Lei Yang2 

Fangzhou Hong 749 Jan 04, 2023
Really awesome semantic segmentation

really-awesome-semantic-segmentation A list of all papers on Semantic Segmentation and the datasets they use. This site is maintained by Holger Caesar

Holger Caesar 400 Nov 28, 2022
A minimal yet resourceful implementation of diffusion models (along with pretrained models + synthetic images for nine datasets)

A minimal yet resourceful implementation of diffusion models (along with pretrained models + synthetic images for nine datasets)

Vikash Sehwag 65 Dec 19, 2022
Links to works on deep learning algorithms for physics problems, TUM-I15 and beyond

Links to works on deep learning algorithms for physics problems, TUM-I15 and beyond

Nils Thuerey 1.3k Jan 08, 2023
Remote sensing change detection using PaddlePaddle

Change Detection Laboratory Developing and benchmarking deep learning-based remo

Lin Manhui 15 Sep 23, 2022
Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks

Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks (SDPoint) This repository contains the cod

Jason Kuen 17 Jul 04, 2022
Offical implementation of Shunted Self-Attention via Multi-Scale Token Aggregation

Shunted Transformer This is the offical implementation of Shunted Self-Attention via Multi-Scale Token Aggregation by Sucheng Ren, Daquan Zhou, Shengf

156 Dec 27, 2022
Synthetic LiDAR sequential point cloud dataset with point-wise annotations

SynLiDAR dataset: Learning From Synthetic LiDAR Sequential Point Cloud This is official repository of the SynLiDAR dataset. For technical details, ple

78 Dec 27, 2022
This is the source code for generating the ASL-Skeleton3D and ASL-Phono datasets. Check out the README.md for more details.

ASL-Skeleton3D and ASL-Phono Datasets Generator The ASL-Skeleton3D contains a representation based on mapping into the three-dimensional space the coo

Cleison Amorim 5 Nov 20, 2022
CapsuleVOS: Semi-Supervised Video Object Segmentation Using Capsule Routing

CapsuleVOS This is the code for the ICCV 2019 paper CapsuleVOS: Semi-Supervised Video Object Segmentation Using Capsule Routing. Arxiv Link: https://a

53 Oct 27, 2022
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion Yinghao Aaron Li, Ali Zare, Nima Mesgarani We pres

Aaron (Yinghao) Li 282 Jan 01, 2023