PyTorch implementation of Decoupled Spatial-Temporal Transformer for Video Inpainting

Last update: Dec 13, 2022

Overview

Decoupled Spatial-Temporal Transformer for Video Inpainting

By Rui Liu, Hanming Deng, Yangyi Huang, Xiaoyu Shi, Lewei Lu, Wenxiu Sun, Xiaogang Wang, Jifeng Dai, Hongsheng Li.

This repo is the official PyTorch implementation of Decoupled Spatial-Temporal Transformer for Video Inpainting.

Introduction

Usage

Prerequisites

Python >= 3.6
PyTorch >= 1.0 and the corresponding torchvision (https://pytorch.org/)
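
The version requirements above can be checked before installing anything else. The following is a minimal sketch, not part of the repository: the names `major_minor` and `check_prerequisites` are hypothetical helpers, and the check only compares major and minor version numbers.

```python
import re
import sys

# Minimum versions taken from the prerequisites list above.
MIN_PYTHON = (3, 6)
MIN_TORCH = (1, 0)


def major_minor(version_text):
    """Return (major, minor) from a version string such as '1.13.0+cu117'."""
    match = re.match(r"(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version_text)
    return int(match.group(1)), int(match.group(2))


def check_prerequisites():
    """Hypothetical helper: report which prerequisites are satisfied."""
    report = {"python_ok": sys.version_info[:2] >= MIN_PYTHON}
    try:
        import torch
    except ImportError:
        # Without torch there is nothing further to compare.
        report["torch_ok"] = False
        report["torchvision_ok"] = False
        return report
    report["torch_ok"] = major_minor(torch.__version__) >= MIN_TORCH
    try:
        import torchvision  # noqa: F401  (only checking that it imports)
        report["torchvision_ok"] = True
    except ImportError:
        report["torchvision_ok"] = False
    return report
```

Calling `check_prerequisites()` returns a dictionary of booleans. It does not verify that the installed torchvision build matches the installed PyTorch build; the selector at https://pytorch.org/ remains the reference for that pairing.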

Install

Clone this repo:

git clone https://github.com/ruiliu-ai/DSTT.git

Install other packages:

cd DSTT
pip install -r requirements.txt

Training

Dataset preparation

Download the YouTube-VOS and DAVIS datasets into the data folder.

mkdir data

Training script

python train.py -c configs/youtube-vos.json

Test

Download the pre-trained model into the checkpoints folder.

mkdir checkpoints

Test script

python test.py -c checkpoints/dstt.pth -v data/DAVIS/JPEGImages/blackswan -m data/DAVIS/Annotations/blackswan
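
To run the same test on several videos, the command above can be generated per video folder. The sketch below is an illustration rather than part of the repository: `build_test_commands` is a hypothetical helper, and it assumes every video folder under data/DAVIS/JPEGImages has a mask folder of the same name under data/DAVIS/Annotations, as in the blackswan example. It only uses the `-c`, `-v` and `-m` flags shown above and prints the commands without running them.

```python
import shlex
from pathlib import Path


def build_test_commands(frames_root, masks_root, checkpoint="checkpoints/dstt.pth"):
    """Hypothetical helper: one test.py command per video with a mask folder.

    Returns (commands, skipped), where `skipped` lists video names that have
    no same-named folder under `masks_root` (an assumed layout).
    """
    frames_root = Path(frames_root)
    masks_root = Path(masks_root)
    commands = []
    skipped = []
    for video_dir in sorted(p for p in frames_root.iterdir() if p.is_dir()):
        mask_dir = masks_root / video_dir.name
        if not mask_dir.is_dir():
            skipped.append(video_dir.name)
            continue
        commands.append([
            "python", "test.py",
            "-c", str(checkpoint),
            "-v", str(video_dir),
            "-m", str(mask_dir),
        ])
    return commands, skipped


def print_commands(commands):
    """Print each command as a shell-quoted line (dry run, nothing executes)."""
    for command in commands:
        print(" ".join(shlex.quote(part) for part in command))
```

For example, `print_commands(build_test_commands("data/DAVIS/JPEGImages", "data/DAVIS/Annotations")[0])` lists the commands for review. Each entry is an argument list, so it could be passed to `subprocess.run` once the printed commands look right.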

Citing DSTT

If you find DSTT useful in your research, please consider citing:

@article{Liu_2021_DSTT,
  title={Decoupled Spatial-Temporal Transformer for Video Inpainting},
  author={Liu, Rui and Deng, Hanming and Huang, Yangyi and Shi, Xiaoyu and Lu, Lewei and Sun, Wenxiu and Wang, Xiaogang and Dai, Jifeng and Li, Hongsheng},
  journal={arXiv preprint arXiv:2104.06637},
  year={2021}
}

Acknowledgement

This code relies heavily on the video inpainting framework from the Spatial-Temporal Transformer Network (STTN).

