Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

Last update: Dec 21, 2022

Related tags

Overview

DSA^2 F: Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

This repo is the official implementation of "DSA^2 F: Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion"

by Peng Sun, Wenhu Zhang, Huanyu Wang, Songyuan Li, and Xi Li.

Prerequisites

Ubuntu 18
PyTorch 1.7.0
CUDA 10.1
Cudnn 7.5.1
Python 3.7
Numpy 1.17.3

Training

Please see launch_train.sh and launch_pretrain.sh for imagenet pretraining and sod training, respectively.

Testing

Please see launch_test.sh for testing on the sod benchmarks.

Main Results

Dataset	E_r	S_λ^mean	F_β^mean	M
DUT-RGBD	0.950	0.921	0.926	0.030
NJUD	0.923	0.903	0.901	0.039
NLPR	0.950	0.918	0.897	0.024
SSD	0.904	0.876	0.852	0.045
STEREO	0.933	0.904	0.898	0.036
LFSD	0.923	0.882	0.882	0.054
RGBD135	0.962	0.920	0.896	0.021

Saliency maps and Evaluation

All of the saliency maps mentioned in the paper are available on GoogleDrive or BaiduYun(code:juc2).

You can use the toolbox provided by jiwei0921 for evaluation.

Additionally, we also provide the saliency maps of the STERE-1000 and SIP dataset on BaiduYun(code:qxfw) for easy comparison.

Dataset	E_r	S_λ^mean	F_β^mean	M
STERE-1000	0.928	0.897	0.895	0.038
SIP	0.908	0.861	0.868	0.057

Citation

@inproceedings{Sun2021DeepRS,
  title={Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion},
  author={P. Sun and Wenhu Zhang and Huanyu Wang and Songyuan Li and Xi Li},
  journal={IEEE Conf. Comput. Vis. Pattern Recog.},
  year={2021}
}

License

The code is released under MIT License (see LICENSE file for details).

Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

Related tags

Overview

DSA^2 F: Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR'2021, Oral)

Prerequisites

Training

Testing

Main Results

Saliency maps and Evaluation

Citation

License

Owner

如今我已剑指天涯

SustainBench: Benchmarks for Monitoring the Sustainable Development Goals with Machine Learning

An implementation for the ICCV 2021 paper Deep Permutation Equivariant Structure from Motion.

Ontologysim: a Owlready2 library for applied production simulation

[ICCV' 21] "Unsupervised Point Cloud Pre-training via Occlusion Completion"

On-device wake word detection powered by deep learning.

Robustness via Cross-Domain Ensembles

A simplified framework and utilities for PyTorch

BraTs-VNet - BraTS(Brain Tumour Segmentation) using V-Net

Code for Generating Disentangled Arguments with Prompts: A Simple Event Extraction Framework that Works

Translate darknet to tensorflow. Load trained weights, retrain/fine-tune using tensorflow, export constant graph def to mobile devices

The official implementation of "Rethink Dilated Convolution for Real-time Semantic Segmentation"

Multiple Object Tracking with Yolov5!

Aiming at the common training datsets split, spectrum preprocessing, wavelength select and calibration models algorithm involved in the spectral analysis process

Perform Linear Classification with Multi-way Data

PyTorch implementation of PSPNet segmentation network

DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting

[ICLR 2021] Is Attention Better Than Matrix Decomposition?

We will release the code of "ConTNet: Why not use convolution and transformer at the same time?" in this repo

Code for KDD'20 "An Efficient Neighborhood-based Interaction Model for Recommendation on Heterogeneous Graph"

Code release for Convolutional Two-Stream Network Fusion for Video Action Recognition