Referring Video Object Segmentation

Overview

Awesome-Referring-Video-Object-Segmentation Awesome

Welcome to starts ⭐ & comments πŸ’Ή & sharing πŸ˜€ !!

- 2021.12.12: Recent papers (from 2021) 
- welcome to add if any information misses. 😎

Introduction

image

Referring video object segmentation aims at segmenting an object in video with language expressions.

Unlike the previous video object segmentation, the task exploits a different type of supervision, language expressions, to identify and segment an object referred by the given language expressions in a video. A detailed explanation of the new task can be found in the following paper.

Seonguk Seo, Joon-Young Lee, Bohyung Han, β€œURVOS: Unified Referring Video Object Segmentation Network with a Large-Scale Benchmark”, European Conference on Computer Vision (ECCV), 2020:https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123600205.pdf

Impressive Works Related to Referring Video Object Segmentation (RVOS)

Cross-modal progressive comprehension for referring segmentation:https://arxiv.org/abs/2105.07175 image

Benchmark

The 3rd Large-scale Video Object Segmentation - Track 3: Referring Video Object Segmentation

Datasets

image

Refer-YouTube-VOS-datasets

  • YouTube-VOS:
wget https://github.com/JerryX1110/awesome-rvos/blob/main/down_YTVOS_w_refer.py
python down_YTVOS_w_refer.py

Folder structure:

${current_path}/
└── refer_youtube_vos/ 
    β”œβ”€β”€ train/
    β”‚   β”œβ”€β”€ JPEGImages/
    β”‚   β”‚   └── */ (video folders)
    β”‚   β”‚       └── *.jpg (frame image files) 
    β”‚   └── Annotations/
    β”‚       └── */ (video folders)
    β”‚           └── *.png (mask annotation files) 
    β”œβ”€β”€ valid/
    β”‚   └── JPEGImages/
    β”‚       └── */ (video folders)
    β”‚           └── *.jpg (frame image files) 
    └── meta_expressions/
        β”œβ”€β”€ train/
        β”‚   └── meta_expressions.json  (text annotations)
        └── valid/
            └── meta_expressions.json  (text annotations)
  • A2D-Sentences:

REPO:https://web.eecs.umich.edu/~jjcorso/r/a2d/

paper:https://arxiv.org/abs/1803.07485

image

Citation:

@misc{gavrilyuk2018actor,
      title={Actor and Action Video Segmentation from a Sentence}, 
      author={Kirill Gavrilyuk and Amir Ghodrati and Zhenyang Li and Cees G. M. Snoek},
      year={2018},
      eprint={1803.07485},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

License: The dataset may not be republished in any form without the written consent of the authors.

README Dataset and Annotation (version 1.0, 1.9GB, tar.bz) Evaluation Toolkit (version 1.0, tar.bz)

mkdir a2d_sentences
cd a2d_sentences
wget https://web.eecs.umich.edu/~jjcorso/bigshare/A2D_main_1_0.tar.bz
tar jxvf A2D_main_1_0.tar.bz
mkdir text_annotations

cd text_annotations
wget https://kgavrilyuk.github.io/actor_action/a2d_annotation.txt
wget https://kgavrilyuk.github.io/actor_action/a2d_missed_videos.txt
wget https://github.com/JerryX1110/awesome-rvos/blob/main/down_a2d_annotation_with_instances.py
python down_a2d_annotation_with_instances.py
unzip a2d_annotation_with_instances.zip
#rm a2d_annotation_with_instances.zip
cd ..

cd ..

Folder structure:

${current_path}/
└── a2d_sentences/ 
    β”œβ”€β”€ Release/
    β”‚   β”œβ”€β”€ videoset.csv  (videos metadata file)
    β”‚   └── CLIPS320/
    β”‚       └── *.mp4     (video files)
    └── text_annotations/
        β”œβ”€β”€ a2d_annotation.txt  (actual text annotations)
        β”œβ”€β”€ a2d_missed_videos.txt
        └── a2d_annotation_with_instances/ 
            └── */ (video folders)
                └── *.h5 (annotations files) 

Citation:

@inproceedings{YaXuCaCVPR2017,
  author = {Yan, Y. and Xu, C. and Cai, D. and {\bf Corso}, {\bf J. J.}},
  booktitle = {{Proceedings of IEEE Conference on Computer Vision and Pattern Recognition}},
  tags = {computer vision, activity recognition, video understanding, semantic segmentation},
  title = {Weakly Supervised Actor-Action Segmentation via Robust Multi-Task Ranking},
  year = {2017}
}
@inproceedings{XuCoCVPR2016,
  author = {Xu, C. and {\bf Corso}, {\bf J. J.}},
  booktitle = {{Proceedings of IEEE Conference on Computer Vision and Pattern Recognition}},
  datadownload = {http://web.eecs.umich.edu/~jjcorso/r/a2d},
  tags = {computer vision, activity recognition, video understanding, semantic segmentation},
  title = {Actor-Action Semantic Segmentation with Grouping-Process Models},
  year = {2016}
}
@inproceedings{XuHsXiCVPR2015,
  author = {Xu, C. and Hsieh, S.-H. and Xiong, C. and {\bf Corso}, {\bf J. J.}},
  booktitle = {{Proceedings of IEEE Conference on Computer Vision and Pattern Recognition}},
  datadownload = {http://web.eecs.umich.edu/~jjcorso/r/a2d},
  poster = {http://web.eecs.umich.edu/~jjcorso/pubs/xu_corso_CVPR2015_A2D_poster.pdf},
  tags = {computer vision, activity recognition, video understanding, semantic segmentation},
  title = {Can Humans Fly? {Action} Understanding with Multiple Classes of Actors},
  url = {http://web.eecs.umich.edu/~jjcorso/pubs/xu_corso_CVPR2015_A2D.pdf},
  year = {2015}
}

image

downloading_script

mkdir jhmdb_sentences
cd jhmdb_sentences
wget http://files.is.tue.mpg.de/jhmdb/Rename_Images.tar.gz
wget https://kgavrilyuk.github.io/actor_action/jhmdb_annotation.txt
wget http://files.is.tue.mpg.de/jhmdb/puppet_mask.zip
tar -xzvf  Rename_Images.tar.gz
unzip puppet_mask.zip
cd ..

Folder structure:

${current_path}/
└── jhmdb_sentences/ 
    β”œβ”€β”€ Rename_Images/  (frame images)
    β”‚   └── */ (action dirs)
    β”œβ”€β”€ puppet_mask/  (mask annotations)
    β”‚   └── */ (action dirs)
    └── jhmdb_annotation.txt  (text annotations)

Citation:

@inproceedings{Jhuang:ICCV:2013,
title = {Towards understanding action recognition},
author = {H. Jhuang and J. Gall and S. Zuffi and C. Schmid and M. J. Black},
booktitle = {International Conf. on Computer Vision (ICCV)},
month = Dec,
pages = {3192-3199},
year = {2013}
}

image image image

Owner
Explorer
Explorer
Implementation for the IJCAI2021 work "Beyond the Spectrum: Detecting Deepfakes via Re-synthesis"

Beyond the Spectrum Implementation for the IJCAI2021 work "Beyond the Spectrum: Detecting Deepfakes via Re-synthesis" by Yang He, Ning Yu, Margret Keu

Yang He 27 Jan 07, 2023
Original code for "Zero-Shot Domain Adaptation with a Physics Prior"

Zero-Shot Domain Adaptation with a Physics Prior [arXiv] [sup. material] - ICCV 2021 Oral paper, by Attila Lengyel, Sourav Garg, Michael Milford and J

Attila Lengyel 40 Dec 21, 2022
CoReNet is a technique for joint multi-object 3D reconstruction from a single RGB image.

CoReNet CoReNet is a technique for joint multi-object 3D reconstruction from a single RGB image. It produces coherent reconstructions, where all objec

Google Research 80 Dec 25, 2022
More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval, CVPR 2021. Ayan Kumar Bhunia, Pinaki nath Chowdh

Ayan Kumar Bhunia 22 Aug 27, 2022
The code for Bi-Mix: Bidirectional Mixing for Domain Adaptive Nighttime Semantic Segmentation

BiMix The code for Bi-Mix: Bidirectional Mixing for Domain Adaptive Nighttime Semantic Segmentation arxiv Framework: visualization results: Requiremen

stanley 18 Sep 18, 2022
Explainable Zero-Shot Topic Extraction

Zero-Shot Topic Extraction with Common-Sense Knowledge Graph This repository contains the code for reproducing the results reported in the paper "Expl

D2K Lab 56 Dec 14, 2022
Learning Modified Indicator Functions for Surface Reconstruction

Learning Modified Indicator Functions for Surface Reconstruction In this work, we propose a learning-based approach for implicit surface reconstructio

4 Apr 18, 2022
Clustering with variational Bayes and population Monte Carlo

pypmc pypmc is a python package focusing on adaptive importance sampling. It can be used for integration and sampling from a user-defined target densi

45 Feb 06, 2022
Telegram chatbot created with deep learning model (LSTM) and telebot library.

Telegram chatbot Telegram chatbot created with deep learning model (LSTM) and telebot library. Description This program will allow you to create very

1 Jan 04, 2022
Noise Conditional Score Networks (NeurIPS 2019, Oral)

Generative Modeling by Estimating Gradients of the Data Distribution This repo contains the official implementation for the NeurIPS 2019 paper Generat

451 Dec 26, 2022
Locally Constrained Self-Attentive Sequential Recommendation

LOCKER This is the pytorch implementation of this paper: Locally Constrained Self-Attentive Sequential Recommendation. Zhankui He, Handong Zhao, Zhe L

Zhankui (Aaron) He 8 Jul 30, 2022
Multi-Content GAN for Few-Shot Font Style Transfer at CVPR 2018

MC-GAN in PyTorch This is the implementation of the Multi-Content GAN for Few-Shot Font Style Transfer. The code was written by Samaneh Azadi. If you

Samaneh Azadi 422 Dec 04, 2022
This repository collects 100 papers related to negative sampling methods.

Negative-Sampling-Paper This repository collects 100 papers related to negative sampling methods, covering multiple research fields such as Recommenda

RUCAIBox 119 Dec 29, 2022
Code for the paper "Query Embedding on Hyper-relational Knowledge Graphs"

Query Embedding on Hyper-Relational Knowledge Graphs This repository contains the code used for the experiments in the paper Query Embedding on Hyper-

DimitrisAlivas 19 Jul 26, 2022
Camera Distortion-aware 3D Human Pose Estimation in Video with Optimization-based Meta-Learning

Camera Distortion-aware 3D Human Pose Estimation in Video with Optimization-based Meta-Learning This is the official repository of "Camera Distortion-

Hanbyel Cho 12 Oct 06, 2022
In the AI for TSP competition we try to solve optimization problems using machine learning.

AI for TSP Competition Goal In the AI for TSP competition we try to solve optimization problems using machine learning. The competition will be hosted

Paulo da Costa 11 Nov 27, 2022
Deep Learning to Improve Breast Cancer Detection on Screening Mammography

Shield: This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Deep Learning to Improve Breast

Li Shen 305 Jan 03, 2023
Rohit Ingole 2 Mar 24, 2022
ALBERT-pytorch-implementation - ALBERT pytorch implementation

ALBERT-pytorch-implementation developing... λͺ¨λΈμ˜ κ°œλ…μ΄ν•΄λ₯Ό 돕기 μœ„ν•œ κ΅¬ν˜„λ¬Όλ‘œ ν˜„μž¬ λ³€μˆ˜λͺ…을 μƒμ„Ένžˆ μ μ—ˆκ³ 

BG Kim 3 Oct 06, 2022
Codes and pretrained weights for winning submission of 2021 Brain Tumor Segmentation (BraTS) Challenge

Winning submission to the 2021 Brain Tumor Segmentation Challenge This repo contains the codes and pretrained weights for the winning submission to th

94 Dec 28, 2022