an implementation of Revisiting Adaptive Convolutions for Video Frame Interpolation using PyTorch

Last update: Dec 22, 2022

Overview

revisiting-sepconv

This is a reference implementation of Revisiting Adaptive Convolutions for Video Frame Interpolation [1] using PyTorch. Given two frames, it uses adaptive convolution [3] in a separable manner [2] to interpolate the intermediate frame. If you use our work, please cite our paper [1].
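
To make the idea of separable adaptive convolution concrete, here is a minimal sketch in plain Python for a single-channel image. It is not the CuPy/CUDA layer from this repository: the function names are hypothetical, the border handling by clamping is an assumption, and the per-pixel kernels (which in practice would be predicted by a neural network) are simply passed in as arguments.

```python
# Minimal sketch of adaptive separable convolution on a single-channel image.
# Plain Python lists are used for readability; all names are hypothetical and
# this is not the CUDA kernel shipped with the repository.

def separable_adaptive_conv(image, vertical, horizontal):
    """Filter `image` with a different kernel at every pixel.

    image:      H x W nested list of floats
    vertical:   H x W nested list of length-K lists (per-pixel vertical kernel)
    horizontal: H x W nested list of length-K lists (per-pixel horizontal kernel)
    The implied 2D kernel at each pixel is the outer product of the two 1D kernels.
    """
    height, width = len(image), len(image[0])
    size = len(vertical[0][0])
    radius = size // 2
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for i in range(size):
                # Clamp coordinates so that border pixels are replicated (assumption).
                yy = min(max(y + i - radius, 0), height - 1)
                for j in range(size):
                    xx = min(max(x + j - radius, 0), width - 1)
                    total += vertical[y][x][i] * horizontal[y][x][j] * image[yy][xx]
            output[y][x] = total
    return output


def interpolate(frame_one, frame_two, kernels_one, kernels_two):
    """Sum the two adaptively filtered frames to form the intermediate frame."""
    vertical_one, horizontal_one = kernels_one
    vertical_two, horizontal_two = kernels_two
    out_one = separable_adaptive_conv(frame_one, vertical_one, horizontal_one)
    out_two = separable_adaptive_conv(frame_two, vertical_two, horizontal_two)
    return [[a + b for a, b in zip(row_one, row_two)]
            for row_one, row_two in zip(out_one, out_two)]
```

Storing two 1D kernels of length K per pixel instead of one K x K kernel is what makes large kernels affordable; the nested loops above only spell out the arithmetic and would be far too slow for real frames.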

For the original SepConv, see: https://github.com/sniklaus/sepconv-slomo
For softmax splatting, please see: https://github.com/sniklaus/softmax-splatting

setup

The separable convolution layer is implemented in CUDA using CuPy, which is why CuPy is a required dependency. It can be installed using pip install cupy or alternatively using one of the provided binary packages as outlined in the CuPy repository.

If you plan to process videos, then please also make sure to have moviepy installed, for example using pip install moviepy.
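
Before running anything, it can help to check that the dependencies are importable. The following is a small sketch using only the Python standard library; the helper name missing_packages is hypothetical and not part of this repository.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # torch and cupy are needed for interpolation; moviepy only for videos.
    missing = missing_packages(["torch", "cupy", "moviepy"])
    if missing:
        print("missing packages:", ", ".join(missing))
    else:
        print("all packages found")
```

Note that this only confirms that the packages can be located. It does not verify that CuPy was built for the CUDA version installed on the machine.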

usage

To run it on your own pair of frames, use the following command.

python run.py --model paper --one ./images/one.png --two ./images/two.png --out ./out.png

To run it on a video, use the following command.

python run.py --model paper --video ./videos/car-turn.mp4 --out ./out.mp4
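
If you have a sequence of frames rather than a single pair, the frame-pair command can be scripted. The sketch below only builds the command lines, using just the flags shown above; the helper name interpolation_commands and the output naming scheme are assumptions made for illustration.

```python
import sys


def interpolation_commands(frames, out_dir, model="paper"):
    """Build one run.py command per pair of consecutive frames.

    Only the flags from the usage section are used; the output file
    naming scheme is an assumption of this sketch.
    """
    commands = []
    for index, (one, two) in enumerate(zip(frames, frames[1:])):
        out = f"{out_dir}/between-{index:04d}.png"
        commands.append([
            sys.executable, "run.py",
            "--model", model,
            "--one", one,
            "--two", two,
            "--out", out,
        ])
    return commands
```

Each returned list can be passed to subprocess.run(command, check=True) from the repository root. Starting a new process per pair reloads the model every time, so this is convenient rather than fast.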

For a quick benchmark using examples from the Middlebury benchmark for optical flow, run python benchmark.py. You can use it to easily verify that the provided implementation runs as expected.
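
This section does not say which metric benchmark.py reports. As an assumption, the sketch below shows PSNR, a common way to compare an interpolated frame against the ground truth; the function name and the flat-sequence input format are hypothetical.

```python
import math


def psnr(estimate, reference, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized flat sequences.

    `peak` is the maximum possible pixel value, 1.0 for images scaled to [0, 1].
    """
    if len(estimate) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((e - r) ** 2 for e, r in zip(estimate, reference)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(peak ** 2 / mse)
```

Higher values indicate that the interpolated frame is closer to the reference.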

video

license

Please refer to the appropriate file within this repository.

references

[1]  @inproceedings{Niklaus_WACV_2021,
         author = {Simon Niklaus and Long Mai and Oliver Wang},
         title = {Revisiting Adaptive Convolutions for Video Frame Interpolation},
         booktitle = {IEEE Winter Conference on Applications of Computer Vision},
         year = {2021}
     }

[2]  @inproceedings{Niklaus_ICCV_2017,
         author = {Simon Niklaus and Long Mai and Feng Liu},
         title = {Video Frame Interpolation via Adaptive Separable Convolution},
         booktitle = {IEEE International Conference on Computer Vision},
         year = {2017}
     }

[3]  @inproceedings{Niklaus_CVPR_2017,
         author = {Simon Niklaus and Long Mai and Feng Liu},
         title = {Video Frame Interpolation via Adaptive Convolution},
         booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
         year = {2017}
     }

Owner

Simon Niklaus
