Hierarchical Memory Matching Network for Video Object Segmentation (ICCV 2021)

Last update: Dec 14, 2022

Overview

Hierarchical Memory Matching Network for Video Object Segmentation

Hongje Seong, Seoung Wug Oh, Joon-Young Lee, Seongwon Lee, Suhyeon Lee, Euntai Kim

ICCV 2021

This is the implementation of HMMN (Hierarchical Memory Matching Network).
The code is based on STM (ICCV 2019): [link].
Please see our paper for details: [paper]

Dependencies

Python 3.8
PyTorch 1.8.1
numpy, opencv, pillow
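
Before running the evaluation scripts, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the import names (`torch` for PyTorch, `cv2` for opencv, `PIL` for pillow) are the usual ones for these packages rather than something this README states. It does not check versions.

```python
import importlib.util
import sys

# Import names assumed for the packages listed above:
# PyTorch -> torch, opencv -> cv2, pillow -> PIL.
REQUIRED_MODULES = ("torch", "numpy", "cv2", "PIL")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Report the interpreter version and any modules that are not installed.
    print("Python", sys.version.split()[0])
    missing = missing_modules()
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

If the script reports missing modules, install them before running the commands in the Code section.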

Trained model

Download the pre-trained weights into the same folder as the demo scripts.
Link: [weights]

Code

DAVIS-2016 validation set (Single-object)

python eval_DAVIS.py -g '0' -s val -y 16 -D [path/to/DAVIS]

DAVIS-2017 validation set (Multi-object)

python eval_DAVIS.py -g '0' -s val -y 17 -D [path/to/DAVIS]
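
The two commands above differ only in the `-y` value. As a rough sketch, assuming `eval_DAVIS.py` sits in the current directory and accepts exactly the flags shown above, both evaluations could be launched from one Python helper. The names `build_eval_command` and `run_all` and their parameters are hypothetical and are not part of the repository.

```python
import subprocess
import sys


def build_eval_command(davis_root, year, gpu="0", split="val"):
    """Build the argument list for eval_DAVIS.py (hypothetical helper).

    Only the flags shown in this README (-g, -s, -y, -D) are used.
    """
    if year not in (16, 17):
        raise ValueError("year must be 16 (DAVIS-2016) or 17 (DAVIS-2017)")
    return [
        sys.executable, "eval_DAVIS.py",
        "-g", gpu,
        "-s", split,
        "-y", str(year),
        "-D", davis_root,
    ]


def run_all(davis_root, gpu="0"):
    """Run the DAVIS-2016 and DAVIS-2017 validation evaluations in turn."""
    for year in (16, 17):
        # check=True stops at the first evaluation that exits with an error.
        subprocess.run(build_eval_command(davis_root, year, gpu), check=True)
```

Calling `run_all("path/to/DAVIS")` with the real dataset location would then run the single-object and multi-object evaluations back to back.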

Pre-computed Results

We also provide pre-computed results for benchmark sets.

Bibtex

@inproceedings{seong2021hierarchical,
  title={Hierarchical Memory Matching Network for Video Object Segmentation},
  author={Seong, Hongje and Oh, Seoung Wug and Lee, Joon-Young and Lee, Seongwon and Lee, Suhyeon and Kim, Euntai},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}

Terms of Use

This software is for non-commercial use only. The source code is released under the Creative Commons Attribution-NonCommercial-ShareAlike (CC BY-NC-SA) Licence (see the licence text for details).
