[IEEE TPAMI21] MobileSal: Extremely Efficient RGB-D Salient Object Detection [PyTorch & Jittor]

Last update: Jan 06, 2023

MobileSal

IEEE TPAMI 2021: MobileSal: Extremely Efficient RGB-D Salient Object Detection

This repository contains the full training and testing code, along with saliency maps generated by the pretrained models. The model achieves competitive performance on the RGB-D salient object detection task at a speed of 450 fps.

If you run into any problems or have difficulty running this code, do not hesitate to open an issue in this repository.

My e-mail is: wuyuhuan @ mail.nankai (dot) edu.cn

[PDF]

Requirements

PyTorch

Python 3.6+
PyTorch >=0.4.1, OpenCV-Python
Tested on PyTorch 1.7.1

Jittor

Python 3.7+
Jittor, OpenCV-Python
Tested on Jittor 1.3.1

For Jittor users, we provide a separate branch named jittor. Please switch to it first:

git checkout jittor

Installing

Please prepare the required packages.

pip install -r envs/requirements.txt
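After installing, a quick sanity check can confirm that the interpreter and packages listed under Requirements are visible. The snippet below is a minimal sketch, not part of this repository: the import names torch (PyTorch) and cv2 (OpenCV-Python) are assumptions based on the Requirements list, and the helper names are hypothetical.

```python
import importlib.util
import sys

# Import names are assumptions based on the Requirements list:
# "torch" for PyTorch and "cv2" for OpenCV-Python.
REQUIRED_MODULES = ("torch", "cv2")
MIN_PYTHON = (3, 6)


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_environment(names=REQUIRED_MODULES, min_python=MIN_PYTHON):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("Python %d.%d+ is required" % min_python)
    for name in missing_modules(names):
        problems.append("module not found: " + name)
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

On the jittor branch, the same check could be called with names=("jittor", "cv2") and min_python=(3, 7), again assuming those import names.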

Data Preparing

Before training/testing our network, please download the training data:

Preprocessed data of 6 datasets: [Google Drive], [Baidu Pan, 9nxi]

Note: if you cannot access Google or Baidu services, contact me by e-mail and I will send you a copy of the data and model weights.

We have converted the data to JSON format, so you can use it without any further preprocessing. Once the download is complete, extract the data and put it in the ./data/ folder. The ./datasets/ folder should then contain six folders: NJU2K/, NLPR/, STERE/, SSD/, SIP/, and DUT-RGBD/, corresponding to the NJU2K, NLPR, STEREO, SSD, SIP, and DUTLF-D datasets, respectively.
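The expected folder layout can be verified before training. The following is a small sketch under stated assumptions: the folder names are copied from the paragraph above, the root path is passed in by the caller, and the function name is hypothetical rather than something this repository provides.

```python
from pathlib import Path

# Folder names copied from the paragraph above.
EXPECTED_FOLDERS = ("NJU2K", "NLPR", "STERE", "SSD", "SIP", "DUT-RGBD")


def missing_dataset_folders(root):
    """Return the expected dataset folders that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    # "./datasets" is taken from the text above; adjust it if your copy differs.
    missing = missing_dataset_folders("./datasets")
    print("missing folders:", missing if missing else "none")
```

An empty result means all six folders were found; it does not check the contents of each folder.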

Train

It is very simple to train our network. We have prepared a script to run the training step:

bash ./tools/train.sh

Pretrained Models

As in our paper, we train our model on the NJU2K_NLPR training set and test it on the NJU2K_test, NLPR_test, STEREO, SIP, and SSD datasets. For DUTLF-D, we train our model on the DUTLF-D training set and evaluate it on the DUTLF-D testing set.

(Default) Trained on NJU2K_NLPR training set:

Single-scale Training: [Google Drive], [Baidu Pan, 9nxi]
Multi-scale Training: [Google Drive], [Baidu Pan, 9nxi]

(Custom) Trained on DUTLF-D training set:

Multi-scale Training: [Google Drive], [Baidu Pan, 9nxi]

Download them and put them into the pretrained/ folder.

Test / Evaluation / Results

After preparing the pretrained models, it is also very simple to test our network:

bash ./tools/test.sh

The script will automatically generate saliency maps in the maps/ directory.
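To get a rough feel for throughput figures such as the 450 fps quoted in the overview, inference can be timed with a simple loop. This README does not state how that number was measured (hardware, batch size, input resolution), so the sketch below is only a generic timing helper: run_once is a hypothetical zero-argument callable wrapping one forward pass, and nothing here is taken from tools/test.sh. For GPU models, the device should be synchronized inside run_once so that the clock covers the full computation.

```python
import time


def measure_fps(run_once, warmup=5, iters=50):
    """Time `run_once()` and return calls per second.

    `run_once` is a hypothetical zero-argument callable that performs one
    forward pass; it is not part of this repository.
    """
    for _ in range(warmup):  # warm-up calls are excluded from the timing
        run_once()
    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    elapsed = time.perf_counter() - start
    return iters / elapsed if elapsed > 0 else float("inf")


def ms_per_image(fps):
    """Convert a frames-per-second figure to milliseconds per image."""
    return 1000.0 / fps


if __name__ == "__main__":
    dummy = lambda: sum(range(1000))  # stand-in workload, not a real model
    print("fps: %.1f" % measure_fps(dummy))
    print("450 fps is about %.2f ms per image" % ms_per_image(450))
```

The conversion helper makes the claim concrete: 450 fps corresponds to roughly 2.22 ms per image.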

Pretrained Saliency maps

For convenience, we provide the saliency maps produced by the pretrained models on several datasets:

Single-scale Training: [Google Drive], [Baidu Pan, 9nxi]
Multi-scale Training: [Google Drive], [Baidu Pan, 9nxi]

TODO

Release the pretrained models and saliency maps on COME15K dataset.
Release the ONNX model for real-world applications.
Add results with the P2T transformer backbone.

Other Tips

I encourage everyone to contact me via my e-mail. My e-mail is: wuyuhuan @ mail.nankai (dot) edu.cn

License

The code is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License for NonCommercial use only.

Citations

If you are using the code/model/data provided here in a publication, please consider citing our work:

@ARTICLE{wu2021mobilesal,
  author={Wu, Yu-Huan and Liu, Yun and Xu, Jun and Bian, Jia-Wang and Gu, Yu-Chao and Cheng, Ming-Ming},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  title={MobileSal: Extremely Efficient RGB-D Salient Object Detection}, 
  year={2021},
  doi={10.1109/TPAMI.2021.3134684}
}

Acknowledgement

This repository was built with the help of the following five projects, for academic use only:

Owner

Yu-Huan Wu
