Learning Dynamic Network Using a Reuse Gate Function in Semi-supervised Video Object Segmentation.

Last update: Jan 01, 2023

Training Script for Reuse-VOS

This is the code implementation of the CVPR 2021 paper: Learning Dynamic Network Using a Reuse Gate Function in Semi-supervised Video Object Segmentation.

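Hard case (Ours, FRTM)

[Figure: hard-case result, Ours]
[Figure: hard-case result, FRTM]

Easy case (Ours, FRTM)

[Figure: easy-case result, Ours]
[Figure: easy-case result, FRTM]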

Requirements

Python packages

torch
opencv-python (imported as cv2)
scikit-image (imported as skimage)
easydict

GPU support

GPU memory >= 11 GB (ResNet18)
CUDA >= 10.0
pytorch >= 1.4.0

Datasets

DAVIS

To test the DAVIS validation split, download and unzip the 2017 480p trainval images and annotations, and place them in this directory structure:

/path/DAVIS
|-- Annotations/
|-- ImageSets/
|-- JPEGImages/

YouTubeVOS

To test our validation split and the YouTubeVOS challenge 'valid' split, download YouTubeVOS 2018 and place it in this directory structure:

/path/ytvos2018
|-- train/
|-- train_all_frames/
|-- valid/
`-- valid_all_frames/
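
Before training or evaluating, it can help to confirm that both dataset roots match the layouts above. The following is a minimal sketch and not part of the repository: the helper name `missing_subdirs` and the constant names are hypothetical, and the paths are placeholders.

```python
from pathlib import Path

# Expected top-level folders, copied from the directory trees above.
DAVIS_SUBDIRS = ("Annotations", "ImageSets", "JPEGImages")
YTVOS_SUBDIRS = ("train", "train_all_frames", "valid", "valid_all_frames")


def missing_subdirs(root, expected):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    # Placeholder paths; replace them with your own dataset locations.
    for root, expected in (("/path/DAVIS", DAVIS_SUBDIRS),
                           ("/path/ytvos2018", YTVOS_SUBDIRS)):
        missing = missing_subdirs(root, expected)
        print(root, "OK" if not missing else f"missing: {missing}")
```

An empty list means the layout is complete; any names returned are the folders that still need to be downloaded or moved into place.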

Release

DAVIS

Model	Backbone	Training set	J&F (DAVIS 2017)	J&F (DAVIS 2016)	Weights
G-FRTM (t=1)	Resnet18	Youtube-VOS + DAVIS	71.7	80.9	Google Drive
G-FRTM (t=0.7)	Resnet18	Youtube-VOS + DAVIS	69.9	80.5	same pth
G-FRTM (t=1)	Resnet101	Youtube-VOS + DAVIS	76.4	84.3	Google Drive
G-FRTM (t=0.7)	Resnet101	Youtube-VOS + DAVIS	74.3	82.3	same pth

Youtube-VOS

Model	Backbone	Training set	G	J-S	J-Us	F-S	F-Us	Weights
G-FRTM (t=1)	Resnet18	Youtube-VOS	63.8	68.3	55.2	70.6	61.0	Google Drive
G-FRTM (t=0.8)	Resnet18	Youtube-VOS	63.4	67.6	55.8	69.3	60.9	same pth
G-FRTM (t=0.7)	Resnet18	Youtube-VOS	62.7	67.1	55.2	68.2	60.1	same pth

We initialize the original FRTM layers from the official FRTM repository weights for the Youtube-VOS benchmark. S = Seen, Us = Unseen.
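
For reading the Youtube-VOS table: the overall score G on that benchmark is the mean of the four sub-scores (J and F on seen and unseen classes). The sketch below checks this against the rows above; the helper name `overall_g` is hypothetical.

```python
def overall_g(j_seen, j_unseen, f_seen, f_unseen):
    """Mean of the four YouTube-VOS sub-scores (J/F on seen/unseen classes)."""
    return (j_seen + j_unseen + f_seen + f_unseen) / 4.0


# Rows copied from the table above:
# (threshold t, reported G, J-S, J-Us, F-S, F-Us)
rows = [
    (1.0, 63.8, 68.3, 55.2, 70.6, 61.0),
    (0.8, 63.4, 67.6, 55.8, 69.3, 60.9),
    (0.7, 62.7, 67.1, 55.2, 68.2, 60.1),
]

for t, reported, *scores in rows:
    g = overall_g(*scores)
    # The table reports one decimal place, so small rounding gaps are expected.
    print(f"t={t}: computed G={g:.3f}, reported G={reported}")
```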

Target model cache

Here is the target model cache file we used for ResNet18.

Run

Train

Open train.py and adjust the paths dict to your dataset locations, checkpoint and tensorboard output directories and the place to cache target model weights.

To train a network, run the following command.

python train.py --name <session-name> --ftext resnet18 --dset all --dev cuda:0

--name is the name of the save directory for the current training session.
--ftext is the name of the feature extractor, either resnet18 or resnet101.
--dset is one of dv2017, ytvos2018 or all ("all" really means "both").
--dev is the name of the device to train on.
--m1 is margin1 for training the reuse gate; we use 1.0 for the DAVIS benchmark and 0.5 for the Youtube-VOS benchmark.
--m2 is margin2 for training the reuse gate; we use 0.
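
The flags described above could be declared with `argparse` roughly as follows. This is a sketch of the command-line interface only, not the parser used in train.py: the flag names and allowed values come from the description above, while the defaults, the help strings, and the `build_parser` name are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the training flags described above."""
    p = argparse.ArgumentParser(description="Reuse-VOS training (sketch)")
    p.add_argument("--name", required=True,
                   help="session name, used for the save directory")
    p.add_argument("--ftext", choices=["resnet18", "resnet101"],
                   default="resnet18", help="feature extractor")
    p.add_argument("--dset", choices=["dv2017", "ytvos2018", "all"],
                   default="all", help='"all" means both datasets')
    p.add_argument("--dev", default="cuda:0", help="training device")
    # Margins for training the reuse gate; these defaults are assumptions.
    p.add_argument("--m1", type=float, default=1.0,
                   help="margin1: 1.0 for DAVIS, 0.5 for Youtube-VOS")
    p.add_argument("--m2", type=float, default=0.0, help="margin2")
    return p


if __name__ == "__main__":
    # Example: the margins reported above for the Youtube-VOS benchmark.
    args = build_parser().parse_args(
        ["--name", "demo", "--dset", "ytvos2018", "--m1", "0.5", "--m2", "0"])
    print(args)
```

Restricting `--ftext` and `--dset` with `choices` makes a mistyped value fail immediately instead of partway through training.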

Replace "session-name" with whatever you like. Subdirectories with this name will be created under your checkpoint and tensorboard paths.

Eval

Open evaluate.py and adjust the paths dict to your dataset locations, checkpoint and tensorboard output directories and the place to cache target model weights.

To evaluate a network, run the following command.

python evaluate.py --ftext resnet18 --dset dv2017val --dev cuda:0

--ftext is the name of the feature extractor, either resnet18 or resnet101.
--dset is one of dv2016val, dv2017val, yt2018jjval, yt2018val or yt2018valAll.
--dev is the name of the device to evaluate on.
--TH is the threshold for tau (default: 0.7).
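
To compare thresholds such as those in the Release tables (t = 0.7, 0.8, 1), the evaluation command can be generated once per setting. The sketch below only builds and prints the commands; `eval_command` is a hypothetical helper, and it assumes `--TH` accepts a float as described above.

```python
import shlex

# Dataset names accepted by --dset, as listed above.
DSETS = ("dv2016val", "dv2017val", "yt2018jjval", "yt2018val", "yt2018valAll")


def eval_command(ftext, dset, dev="cuda:0", th=0.7):
    """Build the evaluate.py argument list for one dataset and threshold."""
    if dset not in DSETS:
        raise ValueError(f"unknown dataset: {dset}")
    return ["python", "evaluate.py", "--ftext", ftext,
            "--dset", dset, "--dev", dev, "--TH", str(th)]


if __name__ == "__main__":
    for th in (0.7, 0.8, 1.0):
        # shlex.join quotes the arguments so each line can be pasted into a shell.
        print(shlex.join(eval_command("resnet18", "dv2017val", th=th)))
```

Each returned list can also be passed directly to `subprocess.run` if the sweep should be executed rather than printed.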

The inference results will be saved at ${ROOT}/${result}. It is worth evaluating multiple .pth files to find the one with the best accuracy.

Acknowledgement

This codebase borrows code and structure from the official FRTM repository. We are grateful to Facebook Inc. for valuable discussions.

Reference

The codebase is built on the following work:

@misc{park2020learning,
      title={Learning Dynamic Network Using a Reuse Gate Function in Semi-supervised Video Object Segmentation}, 
      author={Hyojin Park and Jayeon Yoo and Seohyeong Jeong and Ganesh Venkatesh and Nojun Kwak},
      year={2020},
      eprint={2012.11655},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Owner

HYOJINPARK
