source code of “Visual Saliency Transformer” (ICCV2021)

Related tags

Deep LearningVST
Overview

Visual Saliency Transformer (VST)

source code for our ICCV 2021 paper “Visual Saliency Transformer” by Nian Liu, Ni Zhang, Kaiyuan Wan, Junwei Han, and Ling Shao.

created by Ni Zhang, email: [email protected]

avatar

Requirement

  1. Pytorch 1.6.0
  2. Torchvison 0.7.0

RGB VST for RGB Salient Object Detection

Data Preparation

Training Set

We use the training set of DUTS to train our VST for RGB SOD. Besides, we follow Egnet to generate contour maps of DUTS trainset for training. You can directly download the generated contour maps DUTS-TR-Contour from [baidu pan fetch code: ow76 | Google drive] and put it into RGB_VST/Data folder.

Testing Set

We use the testing set of DUTS, ECSSD, HKU-IS, PASCAL-S, DUT-O, and SOD to test our VST. After Downloading, put them into RGB_VST/Data folder.

Your RGB_VST/Data folder should look like this:

-- Data
   |-- DUTS
   |   |-- DUTS-TR
   |   |-- | DUTS-TR-Image
   |   |-- | DUTS-TR-Mask
   |   |-- | DUTS-TR-Contour
   |   |-- DUTS-TE
   |   |-- | DUTS-TE-Image
   |   |-- | DUTS-TE-Mask
   |-- ECSSD
   |   |--images
   |   |--GT
   ...

Training, Testing, and Evaluation

  1. cd RGB_VST
  2. Download the pretrained T2T-ViT_t-14 model [baidu pan fetch code: 2u34 | Google drive] and put it into pretrained_model/ folder.
  3. Run python train_test_eval.py --Training True --Testing True --Evaluation True for training, testing, and evaluation. The predictions will be in preds/ folder and the evaluation results will be in result.txt file.

Testing on Our Pretrained RGB VST Model

  1. cd RGB_VST
  2. Download our pretrained RGB_VST.pth[baidu pan fetch code: pe54 | Google drive] and then put it in checkpoint/ folder.
  3. Run python train_test_eval.py --Testing True --Evaluation True for testing and evaluation. The predictions will be in preds/ folder and the evaluation results will be in result.txt file.

Our saliency maps can be downloaded from [baidu pan fetch code: 92t0 | Google drive].

SOTA Saliency Maps for Comparison

The saliency maps of the state-of-the-art methods in our paper can be downloaded from [baidu pan fetch code: de4k | Google drive].

RGB-D VST for RGB-D Salient Object Detection

Data Preparation

Training Set

We use 1,485 images from NJUD, 700 images from NLPR, and 800 images from DUTLF-Depth to train our VST for RGB-D SOD. Besides, we follow Egnet to generate corresponding contour maps for training. You can directly download the whole training set from here [baidu pan fetch code: 7vsw | Google drive] and put it into RGBD_VST/Data folder.

Testing Set

NJUD [baidu pan fetch code: 7mrn | Google drive]
NLPR [baidu pan fetch code: tqqm | Google drive]
DUTLF-Depth [baidu pan fetch code: 9jac | Google drive]
STERE [baidu pan fetch code: 93hl | Google drive]
LFSD [baidu pan fetch code: l2g4 | Google drive]
RGBD135 [baidu pan fetch code: apzb | Google drive]
SSD [baidu pan fetch code: j3v0 | Google drive]
SIP [baidu pan fetch code: q0j5 | Google drive]
ReDWeb-S

After Downloading, put them into RGBD_VST/Data folder.

Your RGBD_VST/Data folder should look like this:

-- Data
   |-- NJUD
   |   |-- trainset
   |   |-- | RGB
   |   |-- | depth
   |   |-- | GT
   |   |-- | contour
   |   |-- testset
   |   |-- | RGB
   |   |-- | depth
   |   |-- | GT
   |-- STERE
   |   |-- RGB
   |   |-- depth
   |   |-- GT
   ...

Training, Testing, and Evaluation

  1. cd RGBD_VST
  2. Download the pretrained T2T-ViT_t-14 model [baidu pan fetch code: 2u34 | Google drive] and put it into pretrained_model/ folder.
  3. Run python train_test_eval.py --Training True --Testing True --Evaluation True for training, testing, and evaluation. The predictions will be in preds/ folder and the evaluation results will be in result.txt file.

Testing on Our Pretrained RGB-D VST Model

  1. cd RGBD_VST
  2. Download our pretrained RGBD_VST.pth[baidu pan fetch code: zt0v | Google drive] and then put it in checkpoint/ folder.
  3. Run python train_test_eval.py --Testing True --Evaluation True for testing and evaluation. The predictions will be in preds/ folder and the evaluation results will be in result.txt file.

Our saliency maps can be downloaded from [baidu pan fetch code: jovk | Google drive].

SOTA Saliency Maps for Comparison

The saliency maps of the state-of-the-art methods in our paper can be downloaded from [baidu pan fetch code: i1we | Google drive].

Acknowledgement

We thank the authors of Egnet for providing codes of generating contour maps. We also thank Zhao Zhang for providing the efficient evaluation tool.

Citation

If you think our work is helpful, please cite

@inproceedings{liu2021VST, 
  title={Visual Saliency Transformer}, 
  author={Liu, Nian and Zhang, Ni and Han, Junwei and Shao, Ling},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}
Owner
Ni Zhang PhD student
Activity tragle - Google is tracking everything, we just look at it

activity_tragle Google is tracking everything, we just look at it here. You need

BERNARD Guillaume 1 Feb 15, 2022
Reinforcement learning library(framework) designed for PyTorch, implements DQN, DDPG, A2C, PPO, SAC, MADDPG, A3C, APEX, IMPALA ...

Automatic, Readable, Reusable, Extendable Machin is a reinforcement library designed for pytorch. Build status Platform Status Linux Windows Supported

Iffi 348 Dec 24, 2022
This package implements the algorithms introduced in Smucler, Sapienza, and Rotnitzky (2020) to compute optimal adjustment sets in causal graphical models.

optimaladj: A library for computing optimal adjustment sets in causal graphical models This package implements the algorithms introduced in Smucler, S

Facundo Sapienza 6 Aug 04, 2022
Creative Applications of Deep Learning w/ Tensorflow

Creative Applications of Deep Learning w/ Tensorflow This repository contains lecture transcripts and homework assignments as Jupyter Notebooks for th

Parag K Mital 1.5k Dec 30, 2022
Continuous Query Decomposition for Complex Query Answering in Incomplete Knowledge Graphs

Continuous Query Decomposition This repository contains the official implementation for our ICLR 2021 (Oral) paper, Complex Query Answering with Neura

UCL Natural Language Processing 71 Dec 29, 2022
"Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices", official implementation

Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices This repository contains the official PyTorch implemen

Yandex Research 21 Oct 18, 2022
This is a vision-based 3d model manipulation and control UI

Manipulation of 3D Models Using Hand Gesture This program allows user to manipulation 3D models (.obj format) with their hands. The project support bo

Cortic Technology Corp. 43 Oct 23, 2022
This repository contains code for the paper "Disentangling Label Distribution for Long-tailed Visual Recognition", published at CVPR' 2021

Disentangling Label Distribution for Long-tailed Visual Recognition (CVPR 2021) Arxiv link Blog post This codebase is built on Causal Norm. Install co

Hyperconnect 85 Oct 18, 2022
SeqTR: A Simple yet Universal Network for Visual Grounding

SeqTR This is the official implementation of SeqTR: A Simple yet Universal Network for Visual Grounding, which simplifies and unifies the modelling fo

seanZhuh 76 Dec 24, 2022
Implementation of Bagging and AdaBoost Algorithm

Bagging-and-AdaBoost Implementation of Bagging and AdaBoost Algorithm Dataset Red Wine Quality Data Sets For simplicity, we will have 2 classes of win

Zechen Ma 1 Nov 01, 2021
A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval

CLIP4CMR A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval The original data and pre-calculate

24 Dec 26, 2022
City-seeds - A random generator of cultural characteristics intended to spark ideas and help draw threads

City Seeds This is a random generator of cultural characteristics intended to sp

Aydin O'Leary 2 Mar 12, 2022
An example of Scatterbrain implementation (combining local attention and Performer)

An example of Scatterbrain implementation (combining local attention and Performer)

HazyResearch 97 Jan 02, 2023
RCT-ART is an NLP pipeline built with spaCy for converting clinical trial result sentences into tables through jointly extracting intervention, outcome and outcome measure entities and their relations.

Randomised controlled trial abstract result tabulator RCT-ART is an NLP pipeline built with spaCy for converting clinical trial result sentences into

2 Sep 16, 2022
Official implementation of the PICASO: Permutation-Invariant Cascaded Attentional Set Operator

PICASO Official PyTorch implemetation for the paper PICASO:Permutation-Invariant Cascaded Attentive Set Operator. Requirements Python 3 torch = 1.0 n

Samira Zare 0 Dec 23, 2021
PyTorch implementation of Off-policy Learning in Two-stage Recommender Systems

Off-Policy-2-Stage This repo provides a PyTorch implementation of the MovieLens experiments for the following paper: Off-policy Learning in Two-stage

Jiaqi Ma 25 Dec 12, 2022
EMNLP 2021 Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections

Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections Ruiqi Zhong, Kristy Lee*, Zheng Zhang*, Dan Klein EMN

Ruiqi Zhong 42 Nov 03, 2022
Code for "Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans" CVPR 2021 best paper candidate

News 05/17/2021 To make the comparison on ZJU-MoCap easier, we save quantitative and qualitative results of other methods at here, including Neural Vo

ZJU3DV 748 Jan 07, 2023
LeafSnap replicated using deep neural networks to test accuracy compared to traditional computer vision methods.

Deep-Leafsnap Convolutional Neural Networks have become largely popular in image tasks such as image classification recently largely due to to Krizhev

Sujith Vishwajith 48 Nov 27, 2022