RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation

Overview

RIFE - Real Time Video Interpolation

arXiv | YouTube | Colab | Tutorial | Demo

Table of Contents

  1. Introduction
  2. Software
  3. CLI Usage
  4. Evaluation
  5. Training and Reproduction
  6. Citation
  7. Reference
  8. Sponsor

Introduction

This project is the implementation of RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation. If you are a developer, you are welcome to follow Practical-RIFE, which aims to make RIFE more practical for users by adding various features and designing new models.

Currently, our model can run at 30+ FPS for 2X 720p interpolation on a 2080Ti GPU. It supports 2X, 4X, 8X, ... interpolation, as well as multi-frame interpolation between a pair of images.

16X interpolation results from two input images:


Software

Squirrel-RIFE(中文软件) | Waifu2x-Extension-GUI | Flowframes | RIFE-ncnn-vulkan | RIFE-App(Paid) | Autodesk Flame | SVP |

CLI Usage

Installation

git clone git@github.com:hzwer/arXiv2020-RIFE.git
cd arXiv2020-RIFE
pip3 install -r requirements.txt

Run

Video Frame Interpolation

You can use our demo video or your own video.

python3 inference_video.py --exp=1 --video=video.mp4 

(generate video_2X_xxfps.mp4)

python3 inference_video.py --exp=2 --video=video.mp4

(for 4X interpolation)

python3 inference_video.py --exp=1 --video=video.mp4 --scale=0.5

(If your video has a very high resolution, such as 4K, we recommend setting --scale=0.5 (default 1.0). If the output shows disordered patterns, try --scale=2.0. This parameter controls the processing resolution of the optical flow model.)

python3 inference_video.py --exp=2 --img=input/

(to read the video from PNGs such as input/0.png ... input/612.png; make sure the PNG file names are plain numbers)
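
If your frames are named differently, here is a minimal renaming sketch (the frames/ and input/ folder names are only examples, not part of the repository):

import os
import shutil

src, dst = "frames/", "input/"  # example folders, adjust to your own paths
os.makedirs(dst, exist_ok=True)
pngs = sorted(n for n in os.listdir(src) if n.lower().endswith(".png"))
for i, name in enumerate(pngs):
    # copy each frame under a purely numeric name: input/0.png, input/1.png, ...
    shutil.copy(os.path.join(src, name), os.path.join(dst, f"{i}.png"))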

python3 inference_video.py --exp=2 --video=video.mp4 --fps=60

(adds a slow-motion effect; the audio track will be removed)

python3 inference_video.py --video=video.mp4 --montage --png

(if you want to montage the original video, skip static frames, and save the output in PNG format)

The warning 'Warning: Your video has *** static frames, it may change the duration of the generated video.' means that the frame rate of your video was changed by adding static (duplicated) frames; this is common when a 25 FPS video has been converted to 30 FPS.
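
For reference, "static frames" here simply means consecutive frames that are (nearly) identical. A minimal counting sketch with OpenCV (the threshold and file name are illustrative, not the script's exact logic):

import cv2
import numpy as np

def count_static_frames(path, threshold=1.0):
    # count frames whose mean absolute difference to the previous frame is tiny
    cap = cv2.VideoCapture(path)
    ok, prev = cap.read()
    static = 0
    while ok:
        ok, frame = cap.read()
        if not ok:
            break
        diff = np.abs(frame.astype(np.float32) - prev.astype(np.float32)).mean()
        if diff < threshold:
            static += 1
        prev = frame
    cap.release()
    return static

print(count_static_frames("video.mp4"))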

Image Interpolation

python3 inference_img.py --img img0.png img1.png --exp=4

(produces 2^4 = 16X interpolation results; see the sketch at the end of this subsection for how --exp maps to the number of generated frames) After that, you can use the PNGs to generate an mp4:

ffmpeg -r 10 -f image2 -i output/img%d.png -s 448x256 -c:v libx264 -pix_fmt yuv420p output/slomo.mp4 -q:v 0 -q:a 0

You can also use the PNGs to generate a GIF:

ffmpeg -r 10 -f image2 -i output/img%d.png -s 448x256 -vf "split[s0][s1];[s0]palettegen=stats_mode=single[p];[s1][p]paletteuse=new=1" output/slomo.gif
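
For reference, --exp works by recursive bisection: each level interpolates the midpoint of every adjacent frame pair, so exp=4 produces 2^4 - 1 = 15 intermediate frames between the two inputs. A minimal sketch of this mapping (the interpolate() callable stands in for the model and is hypothetical):

def recursive_midpoints(frame0, frame1, exp, interpolate):
    # return the intermediate frames produced by `exp` levels of midpoint interpolation
    if exp == 0:
        return []
    mid = interpolate(frame0, frame1)
    left = recursive_midpoints(frame0, mid, exp - 1, interpolate)
    right = recursive_midpoints(mid, frame1, exp - 1, interpolate)
    return left + [mid] + right

# exp=4 yields 2**4 - 1 = 15 intermediate frames; with the two inputs that is
# 16 intervals, i.e. "16X" between the pair.
frames = recursive_midpoints(0.0, 1.0, 4, lambda a, b: (a + b) / 2)
assert len(frames) == 2**4 - 1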

Run in docker

Place the pre-trained models in train_log/*.pkl (as above).

Building the container:

docker build -t rife -f docker/Dockerfile .

Running the container:

docker run --rm -it -v $PWD:/host rife:latest inference_video --exp=1 --video=untitled.mp4 --output=untitled_rife.mp4
docker run --rm -it -v $PWD:/host rife:latest inference_img --img img0.png img1.png --exp=4

Using gpu acceleration (requires proper gpu drivers for docker):

docker run --rm -it --gpus all -v /dev/dri:/dev/dri -v $PWD:/host rife:latest inference_video --exp=1 --video=untitled.mp4 --output=untitled_rife.mp4

Evaluation

Download the RIFE model reported in our paper.

UCF101: Download the UCF101 dataset to ./UCF101/ucf101_interp_ours/

Vimeo90K: Download the Vimeo90K dataset to ./vimeo_interp_test

MiddleBury: Download the MiddleBury OTHER dataset to ./other-data and ./other-gt-interp

HD: Download the HD dataset to ./HD_dataset. We also provide a Google Drive download link.

# RIFE
python3 benchmark/UCF101.py
# "PSNR: 35.282 SSIM: 0.9688"
python3 benchmark/Vimeo90K.py
# "PSNR: 35.615 SSIM: 0.9779"
python3 benchmark/MiddleBury_Other.py
# "IE: 1.956"
python3 benchmark/HD.py
# "PSNR: 32.14"
python3 benchmark/HD_multi.py
# "PSNR: 18.60(544*1280), 29.02(720p), 24.73(1080p)"

Training and Reproduction

Download Vimeo90K dataset.

We use 16 CPUs, 4 GPUs and 20G memory for training:

python3 -m torch.distributed.launch --nproc_per_node=4 train.py --world_size=4

Citation

@article{huang2020rife,
  title={RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation},
  author={Huang, Zhewei and Zhang, Tianyuan and Heng, Wen and Shi, Boxin and Zhou, Shuchang},
  journal={arXiv preprint arXiv:2011.06294},
  year={2020}
}

Reference

Optical Flow: ARFlow pytorch-liteflownet RAFT pytorch-PWCNet

Video Interpolation: DVF TOflow SepConv DAIN CAIN MEMC-Net SoftSplat BMBC EDSC EQVI

Sponsor

Thank you for your support. PayPal sponsor: https://www.paypal.com/paypalme/hzwer


Comments
  • Welcome to try v3.8 model

    Based on an evaluation over dozens of videos, the v3.8 model achieves a speed-up of more than 2X while surpassing the quality of the RIFE v2.4 model, and it handles 2D scenes better. We also welcome you to submit bad cases to help guide future model improvements.

    v3.8 model: https://github.com/hzwer/Practical-RIFE#model-list

    opened by hzwer 23
  • 24 to 60 fps?

    Hi there,

    RIFE looks fantastic, but as far as I know I can only enter integer numbers as the scale factor, correct? So when I want to interpolate 24 fps to 60 (by far the most common case, I suppose), I know no other way than interpolating to 120 (factor 5) and then dropping every other frame to get 60.

    But even that doesn't seem to be possible, as the supported scale factors are only 2x, 4x, 8x (no 5x option).

    So, is RIFE able to make 24 fps movie content run smoothly on 60 Hz displays?

    opened by spyro2000 22
  • can't train because torch incompatible with python version

    /home/france1/.local/lib/python3.9/site-packages/torch/distributed/launch.py:178: FutureWarning: The module torch.distributed.launch is deprecated
    and will be removed in future. Use torch.distributed.run.
    Note that --use_env is set by default in torch.distributed.run.
    If your script expects `--local_rank` argument to be set, please
    change it to read from `os.environ['LOCAL_RANK']` instead. See 
    https://pytorch.org/docs/stable/distributed.html#launch-utility for 
    further instructions
    
      warnings.warn(
    WARNING:torch.distributed.run:*****************************************
    Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
    *****************************************
    Traceback (most recent call last):
    Traceback (most recent call last):
      File "/home/france1/arXiv2020-RIFE/train.py", line 140, in <module>
    Traceback (most recent call last):
      File "/home/france1/arXiv2020-RIFE/train.py", line 140, in <module>
      File "/home/france1/arXiv2020-RIFE/train.py", line 140, in <module>
        torch.cuda.set_device(args.local_rank)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/cuda/__init__.py", line 264, in set_device
        torch.cuda.set_device(args.local_rank)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/cuda/__init__.py", line 264, in set_device
        torch.cuda.set_device(args.local_rank)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/cuda/__init__.py", line 264, in set_device
        torch._C._cuda_setDevice(device)
        RuntimeErrortorch._C._cuda_setDevice(device): 
    CUDA error: invalid device ordinal
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    RuntimeError: CUDA error: invalid device ordinal
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
        torch._C._cuda_setDevice(device)
    RuntimeError: CUDA error: invalid device ordinal
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 1 (pid: 651166) of binary: /usr/bin/python3
    Traceback (most recent call last):
      File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/launch.py", line 193, in <module>
        main()
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/launch.py", line 189, in main
        launch(args)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/launch.py", line 174, in launch
        run(args)
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/run.py", line 689, in run
        elastic_launch(
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 116, in __call__
        return launch_agent(self._config, self._entrypoint, list(args))
      File "/home/france1/.local/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 244, in launch_agent
        raise ChildFailedError(
    torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
    ***************************************
                train.py FAILED            
    =======================================
    Root Cause:
    [0]:
      time: 2021-09-30_17:35:27
      rank: 1 (local_rank: 1)
      exitcode: 1 (pid: 651166)
      error_file: <N/A>
      msg: "Process failed with exitcode 1"
    =======================================
    Other Failures:
    [1]:
      time: 2021-09-30_17:35:27
      rank: 2 (local_rank: 2)
      exitcode: 1 (pid: 651167)
      error_file: <N/A>
      msg: "Process failed with exitcode 1"
    [2]:
      time: 2021-09-30_17:35:27
      rank: 3 (local_rank: 3)
      exitcode: 1 (pid: 651168)
      error_file: <N/A>
      msg: "Process failed with exitcode 1"
    ***************************************
    
    
    opened by arch-user-france1 19
  • Transparent PNG support

    Seeing that EXR support was recently added, is it possible to support transparency (alpha channel) for PNG input and output (using --img --png) in inference_video.py?

    This would enable interpolation of transparent GIFs.

    opened by n00mkrad 19
  • Image sequence and input

    Thanks for adding the PNG output function. Could you make the output names consistent with ffmpeg, i.e. 0000.png, 0001.png, ..., 7821.png? Then we could use ffmpeg to process the image sequence.
    Adding image-sequence input would also be great.
    
    opened by Michaelwhite34 15
  • Question: how to prepare the dataset when it contains multiple frames

    Hi, first of all, thank you very much for sharing this repo on GitHub!

    After running inference with your released model, I want to try training on a custom dataset.

    Each video in my dataset has 24 frames, so the goal is to generate the 22 intermediate frames.

    I looked at the VimeoDataset class in dataset.py and found that, when the data is prepared, each video in that dataset has only 3 frames, so it always returns the first and last frames and asks the model to interpolate the middle frame.

    • Is it possible to interpolate multiple frames this way?

    I have already started training with a small adjustment: the input is the first and last frames, and the model is asked to predict the middle frame (the 12th frame). For the model I chose RIFE.py / IFNet.py, whose trained model should be the most robust one (42.9 MB). Because my dataset is rather homogeneous, to prevent overfitting I first loaded your released model and then continued training from it.

    • However, I found that after a few dozen epochs the loss exploded, and at inference time the final model cannot generate the 22 intermediate frames at all (all black images), which is far from the results I got with your released model...

    Later I tried training with another model (RIFE_HDV3.py / IFNet_HDV3.py, whose trained model is 12.2 MB), but PyTorch keeps raising an error:

    RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by (1) passing the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel; (2) making sure all forward function outputs participate in calculating loss. If you already have done the above two steps, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's forward function. Please include the loss function and the structure of the return value of forward of your module when reporting this issue (e.g. list, dict, iterable).

    The error originates from flow, mask, merged = self.flownet(imgs, scale=[4,2,1]) inside update() in RIFE_HDV3.py. From what I found, this error occurs when some variables in the output of forward() are not used to calculate the loss. I went through forward() in IFNet_HDV3 very carefully, but still without success...

    If you have any good suggestions, I would be very grateful!
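
    For reference only (an illustrative sketch, not an official answer): one simple way to turn a 24-frame clip into Vimeo-style triplets is to slide a window over the sequence and treat the middle frame as the interpolation target; the gap parameter below is hypothetical:

    def make_triplets(frames, gap=1):
        # build (first, target, last) triplets mimicking the VimeoDataset layout
        triplets = []
        for i in range(len(frames) - 2 * gap):
            triplets.append((frames[i], frames[i + gap], frames[i + 2 * gap]))
        return triplets

    # 24 frames with gap=1 give 22 triplets, one per intermediate frame
    assert len(make_triplets(list(range(24)))) == 22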

    opened by chenyuZha 13
  • Not the fastest for multi-frame interpolation

    Hi,

    Thanks for open sourcing the code and contributing to the video frame interpolation community.

    In the paper, it is mentioned: "Coupled with the large complexity in the bi-directional flow estimation, none of these methods can achieve real-time speed"

    I believe that might be inappropriate to say, as a recently published paper (https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123720103.pdf) targets efficient multi-frame interpolation.

    It utilizes bi-directional flow estimation as well, but it generates 7 frames in 0.12 seconds, whereas your method requires 0.036 * 7 = 0.252 seconds.

    And the model from that paper is compact, consisting of only ~2M parameters, whereas your fast model has ~10M parameters.

    opened by mrchizx 13
  • On reproducing the HDv2 and HDv3 models

    Hello, I reproduced the HDv2 and HDv3 models, but there is always some gap compared with the results you provide. Hyperparameter configuration: weight_decay=1e-4, learning rate 3e-4 * mul, VGG loss, plus the data augmentation mentioned in #146, patch size 256, trained for 300 epochs. I see that the model structure is unchanged between the HDv2 and HDv3 versions you provide; what are the differences between them? Besides the configuration above, what else am I missing to reach your results?

    opened by tqyunwuxin 12
  • Model v2 update log

    We show some hard-case results for every model version. v2 Google Drive download link: https://drive.google.com/file/d/1wsQIhHZ3Eg4_AfCXItFKqqyDMB4NS0Yd/view

    v1.1 2020.11.16 link: https://pan.baidu.com/s/1SPRw_u3zjaufn7egMr19Eg password: orkd

    opened by hzwer 12
  • Training with other datasets

    Has anyone trained RIFE_HDv2 with a training set other than the Vimeo dataset, such as the HD dataset?

    And were they able to get better visual quality for HD content?

    opened by rsjjdesj 11
  • replicating benchmarks

    Thank you for sharing your code! I was trying to replicate the numbers you stated in your paper using this implementation but have unfortunately been unsuccessful so far. Would you be able to share a script that can be used to replicate the Vimeo-90k metrics you quoted? Also, I think the following padding has some issues.

    https://github.com/hzwer/arXiv2020-RIFE/blob/3194107170d6613b2ea924aa35bb57e5913fff44/inference_img.py#L26-L28

    https://github.com/hzwer/arXiv2020-RIFE/blob/3194107170d6613b2ea924aa35bb57e5913fff44/inference_img.py#L45

    The pw - w and [:h, :w] indicate that pw > w (and ph > h). However, pw = 340 // 32 * 32 = 320 for w = 340 which violates this condition. Thanks for looking into this and thanks again for sharing your code!
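
    For reference, a sketch of ceil-based padding that would guarantee pw >= w and ph >= h, assuming the intent is to pad the input up to the next multiple of 32 (this is my reading, not a confirmed fix of the referenced lines):

    import torch
    import torch.nn.functional as F

    img = torch.zeros(1, 3, 256, 340)            # example: w = 340 is not a multiple of 32
    h, w = img.shape[2], img.shape[3]
    ph = ((h - 1) // 32 + 1) * 32                # round *up* to a multiple of 32
    pw = ((w - 1) // 32 + 1) * 32
    padded = F.pad(img, (0, pw - w, 0, ph - h))  # (left, right, top, bottom)
    print(padded.shape)                          # torch.Size([1, 3, 256, 352])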

    opened by sniklaus 11
  • Reproducibility results

    Hi,

    I checked if I can reproduce the results similar to those in the paper to make sure I am training the model properly. These are the results that I got on Vimeo triplets:

    (attached image: interpol_flow)

    The model prediction is shown for t=1 (predicting between t=0 and t=2); the second row corresponds to interpolation ("Interpol") and the last row to flow ("Flow pred"). I see that the interpolation results are very good; however, I expected the flow to be a bit more accurate. In section 6.2 of the appendix you mention that "IFNet produces clear motion boundaries.", which is also what can be seen in Figure 10. Therefore I wanted to ask if there are any other training steps I need to add to get more accurate flow predictions. I can of course share more prediction examples.

    The training was done for 300 epochs using the reconstruction losses and the distillation loss (the latter with coefficient 0.01) as described in the paper; I didn't change anything in the code to train and obtain these results. This is the loss plot: (attached image: val_loss_300)

    I would appreciate if you can confirm that you trained the model the same way and that your flow predictions look similar. I used IFNet for training (self.flownet = IFNet()).

    opened by HamidGadirov 4
  • How to visualize flow that model inferenced ?

    flow, mask, merged = self.flownet(imgs, scale_list) — flow must be the flow between the two images. Its shape is (bs, 4, H, W); how can I visualize it (e.g. as a color-wheel image like the one attached), and how is the flow ground truth generated? Thanks!
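
    For reference, a minimal sketch of the common HSV color-wheel visualization, assuming the first two of the four channels are the flow toward one of the frames (flow_np and its layout are assumptions, not the repository's exact convention):

    import cv2
    import numpy as np

    def flow_to_color(flow_np):
        # visualize a 2-channel optical flow of shape (H, W, 2) as a BGR image
        fx = np.ascontiguousarray(flow_np[..., 0], dtype=np.float32)
        fy = np.ascontiguousarray(flow_np[..., 1], dtype=np.float32)
        mag, ang = cv2.cartToPolar(fx, fy)
        hsv = np.zeros((flow_np.shape[0], flow_np.shape[1], 3), dtype=np.uint8)
        hsv[..., 0] = (ang * 180 / np.pi / 2).astype(np.uint8)  # hue encodes direction
        hsv[..., 1] = 255                                       # full saturation
        hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
        return cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

    # e.g. color = flow_to_color(flow[0, :2].permute(1, 2, 0).cpu().numpy())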

    opened by zhishao 12
  • RuntimeError: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 1

    Hi, I'm trying to interpolate a PNG sequence. The files are properly numbered. Here is my console output:

    conda run -n RIFE py D:\Development\RIFE\inference_video.py --img "D:\Game Assets\Super Outrun Rush\Animations\Pixelated\WadeCharge" --exp=4
    Loaded v3.x HD model.
    
      0%|          | 0/16 [00:00<?, ?it/s]Traceback (most recent call last):
      File "D:\Development\RIFE\inference_video.py", line 259, in <module>
        output = make_inference(I0, I1, 2**args.exp-1) if args.exp else []
      File "D:\Development\RIFE\inference_video.py", line 180, in make_inference
        middle = model.inference(I0, I1, args.scale)
      File "D:\Development\RIFE\train_log\RIFE_HDv3.py", line 58, in inference
        flow, mask, merged = self.flownet(imgs, scale_list)
      File "C:\Users\Jackson\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "D:\Development\RIFE\train_log\IFNet_HDv3.py", line 113, in forward
        merged[i] = merged[i][0] * mask_list[i] + merged[i][1] * (1 - mask_list[i])
    RuntimeError: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 1
    
    opened by Apple-Fritter-Money-Entertainment 0