Implementation of "Efficient Regional Memory Network for Video Object Segmentation" (Xie et al., CVPR 2021).

Related tags

Deep LearningRMNet
Overview

RMNet

This repository contains the source code for the paper Efficient Regional Memory Network for Video Object Segmentation.

Language grade: Python Total alerts

Overview

Cite this work

@inproceedings{xie2021efficient,
  title={Efficient Regional Memory Network for Video Object Segmentation},
  author={Xie, Haozhe and 
          Yao, Hongxun and 
          Zhou, Shangchen and 
          Zhang, Shengping and 
          Sun, Wenxiu},
  booktitle={CVPR},
  year={2021}
}

Datasets

We use the ECSSD, COCO, PASCAL VOC, MSRA10K, DAVIS, and YouTube-VOS datasets in our experiments, which are available below:

Pretrained Models

The pretrained models for DAVIS and YouTube-VOS are available as follows:

Prerequisites

Clone the Code Repository

git clone https://github.com/hzxie/RMNet.git

Install Python Denpendencies

cd RMNet
pip install -r requirements.txt

Build PyTorch Extensions

NOTE: PyTorch >= 1.4, CUDA >= 9.0 and GCC >= 4.9 are required.

RMNET_HOME=`pwd`

cd $RMNET_HOME/extensions/reg_att_map_generator
python setup.py install --user

cd $RMNET_HOME/extensions/flow_affine_transformation
python setup.py install --user

Precompute the Optical Flow

Update Settings in config.py

You need to update the file path of the datasets:

__C.DATASETS                                     = edict()
__C.DATASETS.DAVIS                               = edict()
__C.DATASETS.DAVIS.INDEXING_FILE_PATH            = './datasets/DAVIS.json'
__C.DATASETS.DAVIS.IMG_FILE_PATH                 = '/path/to/Datasets/DAVIS/JPEGImages/480p/%s/%05d.jpg'
__C.DATASETS.DAVIS.ANNOTATION_FILE_PATH          = '/path/to/Datasets/DAVIS/Annotations/480p/%s/%05d.png'
__C.DATASETS.DAVIS.OPTICAL_FLOW_FILE_PATH        = '/path/to/Datasets/DAVIS/OpticalFlows/480p/%s/%05d.flo'
__C.DATASETS.YOUTUBE_VOS                         = edict()
__C.DATASETS.YOUTUBE_VOS.INDEXING_FILE_PATH      = '/path/to/Datasets/YouTubeVOS/%s/meta.json'
__C.DATASETS.YOUTUBE_VOS.IMG_FILE_PATH           = '/path/to/Datasets/YouTubeVOS/%s/JPEGImages/%s/%s.jpg'
__C.DATASETS.YOUTUBE_VOS.ANNOTATION_FILE_PATH    = '/path/to/Datasets/YouTubeVOS/%s/Annotations/%s/%s.png'
__C.DATASETS.YOUTUBE_VOS.OPTICAL_FLOW_FILE_PATH  = '/path/to/Datasets/YouTubeVOS/%s/OpticalFlows/%s/%s.flo'
__C.DATASETS.PASCAL_VOC                          = edict()
__C.DATASETS.PASCAL_VOC.INDEXING_FILE_PATH       = '/path/to/Datasets/voc2012/trainval.txt'
__C.DATASETS.PASCAL_VOC.IMG_FILE_PATH            = '/path/to/Datasets/voc2012/images/%s.jpg'
__C.DATASETS.PASCAL_VOC.ANNOTATION_FILE_PATH     = '/path/to/Datasets/voc2012/masks/%s.png'
__C.DATASETS.ECSSD                               = edict()
__C.DATASETS.ECSSD.N_IMAGES                      = 1000
__C.DATASETS.ECSSD.IMG_FILE_PATH                 = '/path/to/Datasets/ecssd/images/%s.jpg'
__C.DATASETS.ECSSD.ANNOTATION_FILE_PATH          = '/path/to/Datasets/ecssd/masks/%s.png'
__C.DATASETS.MSRA10K                             = edict()
__C.DATASETS.MSRA10K.INDEXING_FILE_PATH          = './datasets/msra10k.txt'
__C.DATASETS.MSRA10K.IMG_FILE_PATH               = '/path/to/Datasets/msra10k/images/%s.jpg'
__C.DATASETS.MSRA10K.ANNOTATION_FILE_PATH        = '/path/to/Datasets/msra10k/masks/%s.png'
__C.DATASETS.MSCOCO                              = edict()
__C.DATASETS.MSCOCO.INDEXING_FILE_PATH           = './datasets/mscoco.txt'
__C.DATASETS.MSCOCO.IMG_FILE_PATH                = '/path/to/Datasets/coco2017/images/train2017/%s.jpg'
__C.DATASETS.MSCOCO.ANNOTATION_FILE_PATH         = '/path/to/Datasets/coco2017/masks/train2017/%s.png'
__C.DATASETS.ADE20K                              = edict()
__C.DATASETS.ADE20K.INDEXING_FILE_PATH           = './datasets/ade20k.txt'
__C.DATASETS.ADE20K.IMG_FILE_PATH                = '/path/to/Datasets/ADE20K_2016_07_26/images/training/%s.jpg'
__C.DATASETS.ADE20K.ANNOTATION_FILE_PATH         = '/path/to/Datasets/ADE20K_2016_07_26/images/training/%s_seg.png'

# Dataset Options: DAVIS, DAVIS_FRAMES, YOUTUBE_VOS, ECSSD, MSCOCO, PASCAL_VOC, MSRA10K, ADE20K
__C.DATASET.TRAIN_DATASET                        = ['ECSSD', 'PASCAL_VOC', 'MSRA10K', 'MSCOCO']  # Pretrain
__C.DATASET.TRAIN_DATASET                        = ['YOUTUBE_VOS', 'DAVISx5']                    # Fine-tune
__C.DATASET.TEST_DATASET                         = 'DAVIS'

# Network Options: RMNet, TinyFlowNet
__C.TRAIN.NETWORK                                = 'RMNet'

Get Started

To train RMNet, you can simply use the following command:

python3 runner.py

To test RMNet, you can use the following command:

python3 runner.py --test --weights=/path/to/pretrained/model.pth

License

This project is open sourced under MIT license.

Owner
Haozhe Xie
I am a Ph.D. candidate in Harbin Institute of Technology, focusing on 3D reconstruction, video segmentation, and computer vision.
Haozhe Xie
PartImageNet is a large, high-quality dataset with part segmentation annotations

PartImageNet: A Large, High-Quality Dataset of Parts We will release our dataset and scripts soon after cleaning and approval. Introduction PartImageN

Ju He 77 Nov 30, 2022
2 Jul 19, 2022
(Preprint) Official PyTorch implementation of "How Do Vision Transformers Work?"

(Preprint) Official PyTorch implementation of "How Do Vision Transformers Work?"

xxxnell 656 Dec 30, 2022
This is Official implementation for "Pose-guided Feature Disentangling for Occluded Person Re-Identification Based on Transformer" in AAAI2022

PFD:Pose-guided Feature Disentangling for Occluded Person Re-identification based on Transformer This repo is the official implementation of "Pose-gui

Tao Wang 93 Dec 18, 2022
Unconstrained Text Detection with Box Supervisionand Dynamic Self-Training

SelfText Beyond Polygon: Unconstrained Text Detection with Box Supervisionand Dynamic Self-Training Introduction This is a PyTorch implementation of "

weijiawu 34 Nov 09, 2022
Code for BMVC2021 "MOS: A Low Latency and Lightweight Framework for Face Detection, Landmark Localization, and Head Pose Estimation"

MOS-Multi-Task-Face-Detect Introduction This repo is the official implementation of "MOS: A Low Latency and Lightweight Framework for Face Detection,

104 Dec 08, 2022
Simple ONNX operation generator. Simple Operation Generator for ONNX.

sog4onnx Simple ONNX operation generator. Simple Operation Generator for ONNX. https://github.com/PINTO0309/simple-onnx-processing-tools Key concept V

Katsuya Hyodo 6 May 15, 2022
The official PyTorch code implementation of "Personalized Trajectory Prediction via Distribution Discrimination" in ICCV 2021.

Personalized Trajectory Prediction via Distribution Discrimination (DisDis) The official PyTorch code implementation of "Personalized Trajectory Predi

25 Dec 20, 2022
UMich 500-Level Mobile Robotics Course

MOBILE ROBOTICS: METHODS & ALGORITHMS - WINTER 2022 University of Michigan - NA 568/EECS 568/ROB 530 For slides, lecture notes, and example codes, see

393 Dec 29, 2022
Supporting code for short YouTube series Neural Networks Demystified.

Neural Networks Demystified Supporting iPython notebooks for the YouTube Series Neural Networks Demystified. I've included formulas, code, and the tex

Stephen 1.3k Dec 23, 2022
Mengzi Pretrained Models

中文 | English Mengzi 尽管预训练语言模型在 NLP 的各个领域里得到了广泛的应用,但是其高昂的时间和算力成本依然是一个亟需解决的问题。这要求我们在一定的算力约束下,研发出各项指标更优的模型。 我们的目标不是追求更大的模型规模,而是轻量级但更强大,同时对部署和工业落地更友好的模型。

Langboat 424 Jan 04, 2023
Bayesian Optimization Library for Medical Image Segmentation.

bayesmedaug: Bayesian Optimization Library for Medical Image Segmentation. bayesmedaug optimizes your data augmentation hyperparameters for medical im

Şafak Bilici 7 Feb 10, 2022
Traductor de lengua de señas al español basado en Python con Opencv y MedaiPipe

Traductor de señas Traductor de lengua de señas al español basado en Python con Opencv y MedaiPipe Requerimientos 🔧 Python 3.8 o inferior para evitar

Jahaziel Hernandez Hoyos 3 Nov 12, 2022
A Free and Open Source Python Library for Multiobjective Optimization

Platypus What is Platypus? Platypus is a framework for evolutionary computing in Python with a focus on multiobjective evolutionary algorithms (MOEAs)

Project Platypus 424 Dec 18, 2022
Deep learning with TensorFlow and earth observation data.

Deep Learning with TensorFlow and EO Data Complete file set for Jupyter Book Autor: Development Seed Date: 04 October 2021 ISBN: (to come) Notebook tu

Development Seed 20 Nov 16, 2022
The GitHub repository for the paper: “Time Series is a Special Sequence: Forecasting with Sample Convolution and Interaction“.

SCINet This is the original PyTorch implementation of the following work: Time Series is a Special Sequence: Forecasting with Sample Convolution and I

386 Jan 01, 2023
Convolutional Neural Network for 3D meshes in PyTorch

MeshCNN in PyTorch SIGGRAPH 2019 [Paper] [Project Page] MeshCNN is a general-purpose deep neural network for 3D triangular meshes, which can be used f

Rana Hanocka 1.4k Jan 04, 2023
[ICCV2021] Learning to Track Objects from Unlabeled Videos

Unsupervised Single Object Tracking (USOT) 🌿 Learning to Track Objects from Unlabeled Videos Jilai Zheng, Chao Ma, Houwen Peng and Xiaokang Yang 2021

53 Dec 28, 2022
You Only Look One-level Feature (YOLOF), CVPR2021, Detectron2

You Only Look One-level Feature (YOLOF), CVPR2021 A simple, fast, and efficient object detector without FPN. This repo provides a neat implementation

qiang chen 273 Jan 03, 2023
The official implementation of CVPR 2021 Paper: Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation.

Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation This repository is the official implementation of CVPR 2021 paper:

9 Nov 14, 2022