[PAMI 2020] Show, Match and Segment: Joint Weakly Supervised Learning of Semantic Matching and Object Co-segmentation

Last update: Nov 25, 2022

Overview

This repository contains the source code for the paper Show, Match and Segment: Joint Weakly Supervised Learning of Semantic Matching and Object Co-segmentation.

Abstract

We present an approach for jointly matching and segmenting object instances of the same category within a collection of images. In contrast to existing algorithms that tackle the tasks of semantic matching and object co-segmentation in isolation, our method exploits the complementary nature of the two tasks. The key insights of our method are two-fold. First, the estimated dense correspondence fields from semantic matching provide supervision for object co-segmentation by enforcing consistency between the predicted masks from a pair of images. Second, the predicted object masks from object co-segmentation in turn allow us to reduce the adverse effects due to background clutters for improving semantic matching. Our model is end-to-end trainable and does not require supervision from manually annotated correspondences and object masks. We validate the efficacy of our approach on five benchmark datasets: TSS, Internet, PF-PASCAL, PF-WILLOW, and SPair-71k, and show that our algorithm performs favorably against the state-of-the-art methods on both semantic matching and object co-segmentation tasks.
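The two insights in the abstract can be illustrated with a toy sketch. This is not the implementation from the paper or this repository: the function names, the nested-list representation, and the integer correspondence field `corr_ab` are assumptions made only for illustration. The first two functions show how a dense correspondence field lets one mask be checked against the other; the third shows one simple way foreground masks could down-weight matches that involve background clutter.

```python
def warp_mask(mask_b, corr_ab):
    """Pull mask_b into image A's frame.

    corr_ab[y][x] = (yb, xb): the pixel in B matched to pixel (y, x) in A.
    Matches that fall outside B are treated as background (0.0).
    """
    hb, wb = len(mask_b), len(mask_b[0])
    warped = []
    for row in corr_ab:
        warped.append([
            mask_b[yb][xb] if 0 <= yb < hb and 0 <= xb < wb else 0.0
            for (yb, xb) in row
        ])
    return warped


def mask_consistency(mask_a, mask_b, corr_ab):
    """Mean squared difference between mask_a and mask_b warped into A."""
    warped = warp_mask(mask_b, corr_ab)
    diffs = [
        (a - w) ** 2
        for row_a, row_w in zip(mask_a, warped)
        for a, w in zip(row_a, row_w)
    ]
    return sum(diffs) / len(diffs)


def suppress_background(scores, fg_a, fg_b):
    """Down-weight matching scores that involve background positions.

    scores[i][j] is the similarity between position i in A and position j
    in B (positions flattened); fg_a and fg_b are foreground probabilities.
    """
    return [
        [s * fg_a[i] * fg_b[j] for j, s in enumerate(row)]
        for i, row in enumerate(scores)
    ]
```

In this sketch, a pair of masks that agree under the correspondence field gives a consistency value of 0, and the value grows as the warped mask and the predicted mask disagree. A trainable model would use differentiable warping over real-valued coordinates rather than the integer lookup shown here.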

Citation

If you find our code useful, please consider citing our work using the following bibtex:

@article{MaCoSNet,
    title={Show, Match and Segment: Joint Weakly Supervised Learning of Semantic Matching and Object Co-segmentation},
    author={Chen, Yun-Chun and Lin, Yen-Yu and Yang, Ming-Hsuan and Huang, Jia-Bin},
    journal={IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)},
    year={2020}
}

@inproceedings{WeakMatchNet,
  title={Deep Semantic Matching with Foreground Detection and Cycle-Consistency},
  author={Chen, Yun-Chun and Huang, Po-Hsiang and Yu, Li-Yu and Huang, Jia-Bin and Yang, Ming-Hsuan and Lin, Yen-Yu},
  booktitle={Asian Conference on Computer Vision (ACCV)},
  year={2018}
}

Environment

Install Anaconda with Python 3.7.
The code was tested on an NVIDIA V100 GPU with 16 GB of memory.

pip install -r requirements.txt

Dataset

Download the PF-PASCAL, PF-WILLOW, SPair-71k, TSS, and Internet datasets.
Set the DATASET_DIR variable in config.py to match your local setup.
Set the CSV_DIR variable in config.py to match your local setup.
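Before training, it can help to confirm that the two directories actually exist. The following is a minimal sketch, not part of this repository: the helper `missing_dirs` and the example paths are hypothetical, and the real values belong in config.py.

```python
import os


def missing_dirs(paths):
    """Return the entries of `paths` (name -> directory) that do not exist."""
    return {name: p for name, p in paths.items() if not os.path.isdir(p)}


if __name__ == "__main__":
    # Hypothetical locations; replace them with the values set in config.py.
    DATASET_DIR = os.path.expanduser("~/datasets")
    CSV_DIR = os.path.join(DATASET_DIR, "csv")

    problems = missing_dirs({"DATASET_DIR": DATASET_DIR, "CSV_DIR": CSV_DIR})
    for name, path in problems.items():
        print(f"{name} does not exist: {path}")
```

Running the check once after editing config.py catches a mistyped path before a training run starts.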

Training

Select the training dataset by changing the $DATASET variable in train.sh.
Adjust the $BATCH_SIZE variable in train.sh to fit your GPU memory.
The trained model is saved under the trained_models folder.

sh train.sh

Evaluation

Select the dataset to evaluate by changing the $DATASET variable in eval.sh.
Adjust the $BATCH_SIZE variable in eval.sh to fit your GPU memory.

sh eval.sh

Acknowledgement

This code borrows heavily from the code of Rocco et al.

Owner

Yun-Chun Chen
