PyTorch implementation of Decoupled Spatial-Temporal Transformer for Video Inpainting

Last update: Dec 13, 2022

Overview

Decoupled Spatial-Temporal Transformer for Video Inpainting

By Rui Liu, Hanming Deng, Yangyi Huang, Xiaoyu Shi, Lewei Lu, Wenxiu Sun, Xiaogang Wang, Jifeng Dai, Hongsheng Li.

This repo is the official PyTorch implementation of Decoupled Spatial-Temporal Transformer for Video Inpainting (DSTT).

Introduction

Usage

Prerequisites

Python >= 3.6
PyTorch >= 1.0 and the corresponding torchvision (https://pytorch.org/)
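
The two minimum versions above can be checked before installing anything else. The following is a minimal sketch rather than part of the repo: the helper names `version_tuple` and `check_prerequisites` are hypothetical, and the check only looks at the major and minor version numbers.

```python
import re
import sys


def version_tuple(version):
    """Turn a string such as '1.13.1+cu117' into (1, 13): major and minor only."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: {!r}".format(version))
    return int(match.group(1)), int(match.group(2))


def check_prerequisites():
    """Return a list of problems; an empty list means both minimums are met."""
    problems = []
    if sys.version_info[:2] < (3, 6):
        problems.append("Python >= 3.6 is required")
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if version_tuple(torch.__version__) < (1, 0):
            problems.append("PyTorch >= 1.0 is required")
    return problems
```

Calling `check_prerequisites()` in a Python shell returns an empty list when the environment satisfies both requirements. It does not verify that torchvision matches the installed PyTorch build.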

Install

Clone this repo:

git clone https://github.com/ruiliu-ai/DSTT.git

Install other packages:

cd DSTT
pip install -r requirements.txt

Training

Dataset preparation

Create the data folder and download the datasets (YouTube-VOS and DAVIS) into it.

mkdir data

Training script

python train.py -c configs/youtube-vos.json

Test

Create the checkpoints folder and download the pre-trained model into it.

mkdir checkpoints

Test script

python test.py -c checkpoints/dstt.pth -v data/DAVIS/JPEGImages/blackswan -m data/DAVIS/Annotations/blackswan
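
The command above handles one video. To run it over several DAVIS videos, the same arguments can be generated per video folder. This is a sketch under assumptions, not a script shipped with the repo: `build_test_commands` is a hypothetical helper, it assumes every video has a frame folder under `JPEGImages` and a mask folder of the same name under `Annotations` (the layout the command above implies), and it uses only the `-c`, `-v` and `-m` flags shown above.

```python
from pathlib import Path


def build_test_commands(davis_root, checkpoint="checkpoints/dstt.pth"):
    """Build one test.py argument list per video that has both frames and masks."""
    root = Path(davis_root)
    commands = []
    for frames_dir in sorted((root / "JPEGImages").iterdir()):
        masks_dir = root / "Annotations" / frames_dir.name
        # Skip stray files and videos that have no matching mask folder.
        if frames_dir.is_dir() and masks_dir.is_dir():
            commands.append([
                "python", "test.py",
                "-c", checkpoint,
                "-v", frames_dir.as_posix(),
                "-m", masks_dir.as_posix(),
            ])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repo root, for example by looping over `build_test_commands("data/DAVIS")`. Returning the commands instead of running them keeps the helper easy to inspect before any GPU time is spent.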

Citing DSTT

If you find DSTT useful in your research, please consider citing:

@article{Liu_2021_DSTT,
  title={Decoupled Spatial-Temporal Transformer for Video Inpainting},
  author={Liu, Rui and Deng, Hanming and Huang, Yangyi and Shi, Xiaoyu and Lu, Lewei and Sun, Wenxiu and Wang, Xiaogang and Dai, Jifeng and Li, Hongsheng},
  journal={arXiv preprint arXiv:2104.06637},
  year={2021}
}

Acknowledgement

This code relies heavily on the video inpainting framework from the spatial-temporal transformer network (STTN).
