[CVPR'21] Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild

Overview

IVOS-W

Paper

Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild

Zhaoyun Yin, Jia Zheng, Weixin Luo, Shenhan Qian, Hanling Zhang, Shenghua Gao.

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.

[Preprint] [Supplementary Material] [Poster]

Getting Started

Create the environment

# create conda env
conda create -n ivosw python=3.7
# activate conda env
conda activate ivosw
# install pytorch
conda install pytorch=1.3 torchvision
# install other dependencies
pip install -r requirements.txt

We adopt MANet, IPN, and ATNet as the VOS algorithms. Please follow the instructions to install the dependencies.

git clone https://github.com/yuk6heo/IVOS-ATNet.git VOS/ATNet
git clone https://github.com/lightas/CVPR2020_MANet.git VOS/MANet
git clone https://github.com/zyy-cn/IPN.git VOS/IPN

Dataset Preparation

  • DAVIS 2017 Dataset
    • Download the data and human annotated scribbles here.
    • Place DAVIS folder into root/data.
  • YouTube-VOS Dataset
    • Download the YouTube-VOS 2018 version here.
    • Clean up the annotations following here.
    • Download our annotated scribbles here.

Create a DAVIS-like structure of YouTube-VOS by running the following commands:

python datasets/prepare_ytbvos.py --src path/to/youtube_vos --scb path/to/scribble_dir

Evaluation

For evaluation, please download the pretrained agent model and quality assessment model, then place them into root/weights and run the following commands:

python eval_agent_{atnet/manet/ipn}.py with setting={oracle/wild} dataset={davis/ytbvos} method={random/linspace/worst/ours}

The results will be stored in results/{VOS}/{setting}/{dataset}/{method}/summary.json

Note: The results may fluctuate slightly with different versions of networkx, which is used by davisinteractive to generate simulated scribbles.

Training

First, prepare the data used to train the agent by downloading reward records and pretrained experience buffer, place them into root/train, or generate them from scratch:

python produce_reward.py
python pretrain_agent.py

To train the agent:

python train_agent.py

To train the segmentation quality assessment model:

python generate_data.py
python quality_assessment.py

Citation

@inproceedings{IVOSW,
  title     = {Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild},
  author    = {Zhaoyuan Yin and
               Jia Zheng and
               Weixin Luo and
               Shenhan Qian and
               Hanling Zhang and
               Shenghua Gao},
  booktitle = {CVPR},
  year      = {2021}
}

LICENSE

The code is released under the MIT license.

Owner
SVIP Lab
ShanghaiTech Vision and Intelligent Perception Lab
SVIP Lab
Automatic differentiation with weighted finite-state transducers.

GTN: Automatic Differentiation with WFSTs Quickstart | Installation | Documentation What is GTN? GTN is a framework for automatic differentiation with

100 Dec 29, 2022
AutoML library for deep learning

Official Website: autokeras.com AutoKeras: An AutoML system based on Keras. It is developed by DATA Lab at Texas A&M University. The goal of AutoKeras

Keras 8.7k Jan 08, 2023
Node for thenewboston digital currency network.

Project setup For project setup see INSTALL.rst Community Join the community to stay updated on the most recent developments, project roadmaps, and ra

thenewboston 27 Jul 08, 2022
Housing Price Prediction

This project aim was to predict the price of houses in the Boston area during the great financial crisis through regression, as well as classify houses into different quality categories according to

Florian Klement 1 Jan 27, 2022
Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX.

ONNX Object Localization Network Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX. Ori

Ibai Gorordo 15 Oct 14, 2022
StyleGAN-Human: A Data-Centric Odyssey of Human Generation

StyleGAN-Human: A Data-Centric Odyssey of Human Generation Abstract: Unconditional human image generation is an important task in vision and graphics,

stylegan-human 762 Jan 08, 2023
This is the official PyTorch implementation of our paper: "Artistic Style Transfer with Internal-external Learning and Contrastive Learning".

Artistic Style Transfer with Internal-external Learning and Contrastive Learning This is the official PyTorch implementation of our paper: "Artistic S

51 Dec 20, 2022
UnpNet - Rethinking 3-D LiDAR Point Cloud Segmentation(IEEE TNNLS)

UnpNet Citation Please cite the following paper if you use this repository in your reseach. @article {PMID:34914599, Title = {Rethinking 3-D LiDAR Po

Shijie Li 4 Jul 15, 2022
IOT: Instance-wise Layer Reordering for Transformer Structures

Introduction This repository contains the code for Instance-wise Ordered Transformer (IOT), which is introduced in the ICLR2021 paper IOT: Instance-wi

IOT 19 Nov 15, 2022
Official git repo for the CHIRP project

CHIRP Project This is the official git repository for the CHIRP project. Pull requests are accepted here, but for the moment, the main repository is s

Dan Smith 77 Jan 08, 2023
Malware Analysis Neural Network project.

MalanaNeuralNetwork Description Malware Analysis Neural Network project. Table of Contents Getting Started Requirements Installation Clone Set-Up VENV

2 Nov 13, 2021
HackBMU-5.0-Team-Ctrl-Alt-Elite - HackBMU 5.0 Team Ctrl Alt Elite

HackBMU-5.0-Team-Ctrl-Alt-Elite The search is over. We present to you ‘Health-A-

3 Feb 19, 2022
PEPit is a package enabling computer-assisted worst-case analyses of first-order optimization methods.

PEPit: Performance Estimation in Python This open source Python library provides a generic way to use PEP framework in Python. Performance estimation

Baptiste 53 Nov 16, 2022
CC-GENERATOR - A python script for generating CC

CC-GENERATOR A python script for generating CC NOTE: This tool is for Educationa

Lêkzï 6 Oct 14, 2022
Open-source python package for the extraction of Radiomics features from 2D and 3D images and binary masks.

pyradiomics v3.0.1 Build Status Linux macOS Windows Radiomics feature extraction in Python This is an open-source python package for the extraction of

Artificial Intelligence in Medicine (AIM) Program 842 Dec 28, 2022
Scene-Text-Detection-and-Recognition (Pytorch)

Scene-Text-Detection-and-Recognition (Pytorch) Competition URL: https://tbrain.t

Gi-Luen Huang 9 Jan 02, 2023
This repository contains the source code for the paper "DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks",

DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks Project Page | Video | Presentation | Paper | Data L

Facebook Research 281 Dec 22, 2022
Unofficial PyTorch Implementation of "DOLG: Single-Stage Image Retrieval with Deep Orthogonal Fusion of Local and Global Features"

Pytorch Implementation of Deep Orthogonal Fusion of Local and Global Features (DOLG) This is the unofficial PyTorch Implementation of "DOLG: Single-St

DK 96 Jan 06, 2023
[TOG 2021] PyTorch implementation for the paper: SofGAN: A Portrait Image Generator with Dynamic Styling.

This repository contains the official PyTorch implementation for the paper: SofGAN: A Portrait Image Generator with Dynamic Styling. We propose a SofGAN image generator to decouple the latent space o

Anpei Chen 694 Dec 23, 2022
Official code for paper "Demystifying Local Vision Transformer: Sparse Connectivity, Weight Sharing, and Dynamic Weight"

Demysitifing Local Vision Transformer, arxiv This is the official PyTorch implementation of our paper. We simply replace local self attention by (dyna

138 Dec 28, 2022