(IEEE TIP 2021) Regularized Densely-connected Pyramid Network for Salient Instance Segmentation

Overview

RDPNet

IEEE TIP 2021: Regularized Densely-connected Pyramid Network for Salient Instance Segmentation

PyTorch training and testing code are available. We have achieved SOTA performance on the salient instance segmentation (SIS) task.

If you run into any problems or feel any difficulties to run this code, do not hesitate to leave issues in this repository.

My e-mail is: wuyuhuan @ mail.nankai (dot) edu.cn

[Official Ver.] [PDF]

Citations

If you are using the code/model/data provided here in a publication, please consider citing:

@article{wu2021regularized,
   title={Regularized Densely-Connected Pyramid Network for Salient Instance Segmentation},
   volume={30},
   ISSN={1941-0042},
   DOI={10.1109/tip.2021.3065822},
   journal={IEEE Transactions on Image Processing},
   publisher={Institute of Electrical and Electronics Engineers (IEEE)},
   author={Wu, Yu-Huan and Liu, Yun and Zhang, Le and Gao, Wang and Cheng, Ming-Ming},
   year={2021},
   pages={3897–3907}
}

Requirements

  • PyTorch 1.1/1.0.1, Torchvision 0.2.2.post3, CUDA 9.0/10.0/10.1, apex
  • Validated on Ubuntu 16.04/18.04, PyTorch 1.1/1.0.1, CUDA 9.0/10.0/10.1, NVIDIA TITAN Xp

Installing

Please check INSTALL.md.

Note: we have provided an early tested apex version (url: here) and place it in our root folder (./apex/). You can also try other apex versions, which are not tested by us.

Data

Before training/testing our network, please download the data: [Google Drive, 0.7G], [Baidu Yun, yhwu].

The above zip file contains data of the ISOD and SOC dataset.

Note: if you are blocked by Google and Baidu services, you can contact me via e-mail and I will send you a copy of data and model weights.

We have processed the data to json format so you can use them without any preprocessing steps. After completion of downloading, extract the data and put them to ./datasets/ folder. Then, the ./datasets/ folder should contain two folders: isod/, soc/.

Train

It is very simple to train our network. We have prepared a script to run the training step. You can at first train our ResNet-50-based network on the ISOD dataset:

cd scripts
bash ./train_isod.sh

The training step should cost less than 1 hour for single GTX 1080Ti or TITAN Xp. This script will also store the network code, config file, log, and model weights.

We also provide ResNet-101 and ResNeXt-101 training scripts, and they are all in the scripts folder.

The default training code is for single gpu training since the training time is very low. You can also try multi gpus training by replacing --nproc_per_node=1 \ with --nproc_per_node=2 \ for 2-gpu training.

Test / Evaluation / Results

It is also very simple to test our network. First you need to download the model weights:

Taking the test on the ISOD dataset for example:

  1. Download the ISOD trained model weights, put it to model_zoo/ folder.
  2. cd the scripts folder, then run bash test_isod.sh.
  3. Testing step usually costs less than a minute. We use the official cocoapi for evaluation.

Note1: We strongly recommend to use cocoapi to evaluate the performance. Such evaluation is also automatically done with the testing process.

Note2: Default cocoapi evaluation outputs AP, AP50, AP75 peformance. To output the score of AP70, you need to change the cocoeval.py in cocoapi. See changes in this commitment:

BEFORE: stats[2] = _summarize(1, iouThr=.75, maxDets=self.params.maxDets[2])
AFTER:  stats[2] = _summarize(1, iouThr=.70, maxDets=self.params.maxDets[2])

Note3: If you are not familiar with the evalutation metric AP, AP50, AP75, you can refer to the introduction website here. Our official paper also introduces them in the Experiments section.

Visualize

We provide a simple python script to visualize the result: demo/visualize.py.

  1. Be sure that you have downloaded the ISOD pretrained weights [Google Drive, 0.14G].
  2. Put images to the demo/examples/ folder. I have prepared some images in this paper so do not worry that you have no images.
  3. cd demo, run python visualize.py
  4. Visualized images are generated in the same folder. You can change the target folder in visualize.py.

TODO

  1. Release the weights for real-world applications
  2. Add Jittor implementation
  3. Train with the enhanced base detector (FCOS TPAMI version) for better performance. Currently the base detector is the FCOS conference version with a bit lower performance.

Other Tips

I am free to answer your question if you are interested in salient instance segmentation. I also encourage everyone to contact me via my e-mail. My e-mail is: wuyuhuan @ mail.nankai (dot) edu.cn

Acknowlogdement

This repository is built under the help of the following three projects for academic use only:

Owner
Yu-Huan Wu
Ph.D. student at Nankai University
Yu-Huan Wu
Testability-Aware Low Power Controller Design with Evolutionary Learning, ITC2021

Testability-Aware Low Power Controller Design with Evolutionary Learning This repo contains the source code of Testability-Aware Low Power Controller

Lee Man 1 Dec 26, 2021
A high-level Python library for Quantum Natural Language Processing

lambeq About lambeq is a toolkit for quantum natural language processing (QNLP). Documentation: https://cqcl.github.io/lambeq/ User support: lambeq-su

Cambridge Quantum 315 Jan 01, 2023
Experimental Python implementation of OpenVINO Inference Engine (very slow, limited functionality). All codes are written in Python. Easy to read and modify.

PyOpenVINO - An Experimental Python Implementation of OpenVINO Inference Engine (minimum-set) Description The PyOpenVINO is a spin-off product from my

Yasunori Shimura 7 Oct 31, 2022
Lucid Sonic Dreams syncs GAN-generated visuals to music.

Lucid Sonic Dreams Lucid Sonic Dreams syncs GAN-generated visuals to music. By default, it uses NVLabs StyleGAN2, with pre-trained models lifted from

731 Jan 02, 2023
A series of convenience functions to make basic image processing operations such as translation, rotation, resizing, skeletonization, and displaying Matplotlib images easier with OpenCV and Python.

imutils A series of convenience functions to make basic image processing functions such as translation, rotation, resizing, skeletonization, and displ

Adrian Rosebrock 4.3k Jan 08, 2023
Code for HLA-Face: Joint High-Low Adaptation for Low Light Face Detection (CVPR21)

HLA-Face: Joint High-Low Adaptation for Low Light Face Detection The official PyTorch implementation for HLA-Face: Joint High-Low Adaptation for Low L

Wenjing Wang 77 Dec 08, 2022
Implementation of ICCV19 Paper "Learning Two-View Correspondences and Geometry Using Order-Aware Network"

OANet implementation Pytorch implementation of OANet for ICCV'19 paper "Learning Two-View Correspondences and Geometry Using Order-Aware Network", by

Jiahui Zhang 225 Dec 05, 2022
Code for our CVPR 2021 Paper "Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes".

Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes (CVPR 2021) Project page | Paper | Colab | Colab for Drawing App Rethinking Style

CompVis Heidelberg 153 Jan 04, 2023
Official repository for "Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems"

Action-Based Conversations Dataset (ABCD) This respository contains the code and data for ABCD (Chen et al., 2021) Introduction Whereas existing goal-

ASAPP Research 49 Oct 09, 2022
🤗 Transformers: State-of-the-art Natural Language Processing for Pytorch, TensorFlow, and JAX.

English | 简体中文 | 繁體中文 | 한국어 State-of-the-art Natural Language Processing for Jax, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrai

Hugging Face 77.4k Jan 05, 2023
An example of time series augmentation methods with Keras

Time Series Augmentation This is a collection of time series data augmentation methods and an example use using Keras. News 2020/04/16: Repository Cre

九州大学 ヒューマンインタフェース研究室 229 Jan 02, 2023
Project to create an open-source 6 DoF input device

6DInputs A Project to create open-source 3D printed 6 DoF input devices Note the plural ('6DInputs' and 'devices') in the headings. We would like seve

RepRap Ltd 47 Jul 28, 2022
A multi-entity Transformer for multi-agent spatiotemporal modeling.

baller2vec This is the repository for the paper: Michael A. Alcorn and Anh Nguyen. baller2vec: A Multi-Entity Transformer For Multi-Agent Spatiotempor

Michael A. Alcorn 56 Nov 15, 2022
Controlling a game using mediapipe hand tracking

These scripts use the Google mediapipe hand tracking solution in combination with a webcam in order to send game instructions to a racing game. It features 2 methods of control

3 May 17, 2022
Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR

Official implementation for paper "Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR"

Ziyue Feng 72 Dec 09, 2022
Deepfake Scanner by Deepware.

Deepware Scanner (CLI) This repository contains the command-line deepfake scanner tool with the pre-trained models that are currently used at deepware

deepware 110 Jan 02, 2023
The Most Efficient Temporal Difference Learning Framework for 2048

moporgic/TDL2048+ TDL2048+ is a highly optimized temporal difference (TD) learning framework for 2048. Features Many common methods related to 2048 ar

Hung Guei 5 Nov 23, 2022
D2Go is a toolkit for efficient deep learning

D2Go D2Go is a production ready software system from FacebookResearch, which supports end-to-end model training and deployment for mobile platforms. W

Facebook Research 744 Jan 04, 2023
"Exploring Vision Transformers for Fine-grained Classification" at CVPRW FGVC8

FGVC8 Exploring Vision Transformers for Fine-grained Classification paper presented at the CVPR 2021, The Eight Workshop on Fine-Grained Visual Catego

Marcos V. Conde 19 Dec 06, 2022
Classic Papers for Beginners and Impact Scope for Authors.

There have been billions of academic papers around the world. However, maybe only 0.0...01% among them are valuable or are worth reading. Since our limited life has never been forever, TopPaper provi

Qiulin Zhang 228 Dec 18, 2022