The official repo of the CVPR 2021 paper "Group Collaborative Learning for Co-Salient Object Detection".

Last update: Nov 17, 2022

GCoNet

Trained model

Download final_gconet.pth (Google Drive). The training log for this model is provided as well.

Put final_gconet.pth at GCoNet/tmp/GCoNet_run1.

Run test.sh for evaluation.
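
The placement step above can be scripted. The following is a minimal sketch, not part of the repo: the helper name `install_checkpoint` and its arguments are hypothetical, and it only assumes the target path GCoNet/tmp/GCoNet_run1 stated above.

```python
import shutil
from pathlib import Path


def install_checkpoint(downloaded: Path, repo_root: Path) -> Path:
    """Copy the downloaded weights to <repo_root>/tmp/GCoNet_run1/final_gconet.pth.

    `downloaded` is wherever the file was saved; `repo_root` is the GCoNet
    checkout. Both are supplied by the caller (hypothetical helper).
    """
    target_dir = Path(repo_root) / "tmp" / "GCoNet_run1"
    # Create the run directory if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "final_gconet.pth"
    shutil.copy2(downloaded, target)
    return target
```

After the file is in place, test.sh can be run from the repository root as described above.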

Data Format

Place the DUTS_class (training dataset from GICD), CoCA, CoSOD3k and Cosal2015 datasets under GCoNet/data with the following structure:

GCoNet
   ├── other codes
   ├── ...
   │
   └── data
         ├── images
         │     ├── DUTS_class (DUTS_class image files)
         │     ├── CoCA (CoCA image files)
         │     ├── CoSOD3k (CoSOD3k image files)
         │     └── Cosal2015 (Cosal2015 image files)
         │
         └── gts
               ├── DUTS_class (DUTS_class ground-truth files)
               ├── CoCA (CoCA ground-truth files)
               ├── CoSOD3k (CoSOD3k ground-truth files)
               └── Cosal2015 (Cosal2015 ground-truth files)
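
Before training or testing, it can help to confirm that the layout above is in place. The following is a small sketch under that assumption; `missing_dirs` is a hypothetical helper, not a function from the repo, and it checks only that the eight dataset folders exist, not their contents.

```python
from pathlib import Path

# Folder names taken from the directory tree above.
DATASETS = ("DUTS_class", "CoCA", "CoSOD3k", "Cosal2015")
SUBDIRS = ("images", "gts")


def missing_dirs(data_root):
    """Return the expected '<subdir>/<dataset>' folders that are absent under data_root."""
    data_root = Path(data_root)
    return [
        f"{sub}/{name}"
        for sub in SUBDIRS
        for name in DATASETS
        if not (data_root / sub / name).is_dir()
    ]
```

Calling `missing_dirs("GCoNet/data")` returns an empty list when every folder is present, and otherwise lists what still needs to be added.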

Usage

Run sh all.sh for training (train_GPU0.sh) and testing (test.sh).

Prediction results

The co-saliency maps of GCoNet can be found at Google Drive.

Note and Discussion

When you train the model yourself, you will usually obtain slightly worse performance on the CoCA dataset and slightly better performance on the Cosal2015 and CoSOD3k datasets. The performance fluctuation is around 1.0 point on Cosal2015 and CoSOD3k and around 2.0 points on CoCA.

We observed that the results on the CoCA dataset are unstable when the model is trained multiple times, and the performance fluctuation can reach around 1.5 points (our performance is still much better than other methods even in the worst case).
Therefore, we provide the training pairs and sequences we used, with deterministic data augmentation, to help you reproduce our results on CoCA. (On different machines, these inputs and data augmentations differ but remain deterministic.) However, there is still randomness in the training stage, so you may obtain different performance on CoCA.

There are three possible reasons:

1. The challenging images of the CoCA dataset. The target objects are relatively small, and there are many non-target objects in complex environments.
2. The imperfect training dataset. We use the training dataset from GICD, whose labels are produced by a classification model, so the training dataset contains some noisy labels.
3. The randomness of training groups. In our training, two groups are randomly picked for training, and different collaborative training groups have different training difficulty.
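
To illustrate the last point, the sketch below shows one way to make the random choice of two groups repeatable by drawing it from a seeded generator. It is an illustration only, not the repo's data loader: `sample_group_pairs`, its arguments, and the seed value are hypothetical, and fixing this sequence does not remove the other sources of randomness in training mentioned above.

```python
import random


def sample_group_pairs(group_names, num_steps, seed):
    """Draw `num_steps` pairs of distinct groups from a private, seeded RNG.

    The same (group_names, num_steps, seed) always yields the same sequence,
    independent of the global random state.
    """
    rng = random.Random(seed)
    # Sort first so the result does not depend on the input ordering.
    names = sorted(group_names)
    return [tuple(rng.sample(names, 2)) for _ in range(num_steps)]
```

Using a private `random.Random` instance rather than the module-level functions keeps the pair sequence unaffected by any other code that consumes random numbers.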

Possible research directions for performance stability:

1. Reduce label noise. If you want to train your model on the training dataset from GICD, it is better to use an ensemble of multiple powerful classification models to obtain better class labels.
2. Deterministic training groups. For the two collaborative image groups, you can explore different ways to pick suitable groups, e.g., pick the two most similar groups for hard example mining.
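
As a rough illustration of the second direction, the sketch below picks the two most similar groups by cosine similarity between per-group feature vectors. Everything here is an assumption for illustration: the source does not specify how similarity should be measured, and `most_similar_pair` and the idea of one feature vector per group (for example, a mean image embedding) are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar_pair(group_features):
    """Return the pair of group names whose feature vectors are most similar.

    `group_features` maps a group name to one vector summarizing that group.
    """
    return max(
        combinations(sorted(group_features), 2),
        key=lambda pair: cosine(group_features[pair[0]], group_features[pair[1]]),
    )
```

Because the choice depends only on the features, the same inputs always give the same pair, which removes the random group selection from the list of variables.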

It is a potential research direction to obtain stable results on such challenging real-world images. We follow other CoSOD methods to report the best performance of our model. You need to train the model multiple times to obtain the best result on CoCA dataset. If you want more discussion about it, you can contact me ([email protected]).
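
When training several times as described above, it is convenient to record one score per run and summarize them. The following is a minimal sketch; `summarize_runs` is a hypothetical helper, and it assumes a metric where higher is better (for an error metric such as MAE, `best` would be the minimum instead).

```python
from statistics import mean


def summarize_runs(scores):
    """Summarize one metric across several training runs.

    Assumes higher is better. `spread` is the gap between the best and the
    worst run, i.e. the fluctuation observed across runs.
    """
    if not scores:
        raise ValueError("need at least one run")
    return {
        "best": max(scores),
        "mean": mean(scores),
        "spread": max(scores) - min(scores),
    }
```

Reporting the mean and spread alongside the best value makes the run-to-run fluctuation visible to readers.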

Citation

@inproceedings{fan2021gconet,
title={Group Collaborative Learning for Co-Salient Object Detection},
author={Fan, Qi and Fan, Deng-Ping and Fu, Huazhu and Tang, Chi-Keung and Shao, Ling and Tai, Yu-Wing},
booktitle={CVPR},
year={2021}
}

Acknowledgements

Zhao Zhang gave us a lot of help! Our framework is built on his GICD.

Owner

Qi Fan
