Tensorflow 2 implementation of our high quality frame interpolation neural network

Overview

FILM: Frame Interpolation for Large Scene Motion

Project | Paper | YouTube | Benchmark Scores

This is a TensorFlow 2 implementation of our high-quality frame interpolation neural network. We present a unified single-network approach that does not use additional pre-trained networks, such as optical flow or depth networks, and yet achieves state-of-the-art results. We use a multi-scale feature extractor that shares the same convolution weights across the scales. Our model is trainable from frame triplets alone.

FILM: Frame Interpolation for Large Motion
Fitsum Reda, Janne Kontkanen, Eric Tabellion, Deqing Sun, Caroline Pantofaru, Brian Curless
Google Research
Technical Report 2022.

A sample 2-second moment. FILM transforms near-duplicate photos into slow-motion footage that looks like it was shot with a video camera.

Installation

  • Get the Frame Interpolation source code
> git clone https://github.com/google-research/frame-interpolation frame_interpolation
  • Optionally, pull the recommended Docker base image
> docker pull gcr.io/deeplearning-platform-release/tf2-gpu.2-6:latest
  • Install dependencies
> pip install -r frame_interpolation/requirements.txt
> apt-get install ffmpeg

Pre-trained Models

  • Create a directory where you can keep large files. Ideally, not in this directory.
> mkdir <pretrained_models>
  • Download the pre-trained TF2 SavedModels from Google Drive and put them into <pretrained_models>.

The downloaded folder should have the following structure:

pretrained_models/
├── film_net/
│   ├── L1/
│   ├── VGG/
│   ├── Style/
├── vgg/
│   ├── imagenet-vgg-verydeep-19.mat
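
Before running anything, it can help to confirm that the download matches the layout above. The following is a minimal sketch, not part of the repository: the function name `missing_pretrained_paths` and the constant `EXPECTED_PATHS` are hypothetical, and the sub-paths are copied from the folder listing.

```python
from pathlib import Path

# Sub-paths copied from the folder listing above, relative to the download root.
EXPECTED_PATHS = (
    "film_net/L1",
    "film_net/VGG",
    "film_net/Style",
    "vgg/imagenet-vgg-verydeep-19.mat",
)


def missing_pretrained_paths(root):
    """Return the expected sub-paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]
```

Calling `missing_pretrained_paths("pretrained_models")` returns an empty list when every expected entry is present, and the list of missing entries otherwise.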

Running the Codes

The following instructions run the interpolator on the photos provided in frame_interpolation/photos.

One mid-frame interpolation

To generate an intermediate photo from the input near-duplicate photos, simply run:

> python3 -m frame_interpolation.eval.interpolator_test \
     --frame1 frame_interpolation/photos/one.png \
     --frame2 frame_interpolation/photos/two.png \
     --model_path <pretrained_models>/film_net/Style/saved_model \
     --output_frame frame_interpolation/photos/middle.png

This will produce the sub-frame at t=0.5 and save it as 'frame_interpolation/photos/middle.png'.

Many in-between frames interpolation

The interpolator CLI takes a set of directories identified by a glob (--pattern). Each directory is expected to contain at least two input frames, and each contiguous frame pair is treated as an input for generating in-between frames.
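
The pairing described here can be sketched as follows. This is an illustration only, not the repository's implementation: the helper names are hypothetical, and sorting frames by filename and filtering on a `.png` extension are assumptions of the sketch.

```python
import glob
import os


def contiguous_pairs(frames):
    """Pair each frame with its successor: [a, b, c] -> [(a, b), (b, c)]."""
    return list(zip(frames, frames[1:]))


def frame_pairs_by_directory(pattern, extension=".png"):
    """Map each directory matched by `pattern` to its contiguous frame pairs.

    Assumption: frames are ordered by filename and share one extension.
    """
    pairs = {}
    for directory in sorted(glob.glob(pattern)):
        if not os.path.isdir(directory):
            continue
        frames = sorted(
            os.path.join(directory, name)
            for name in os.listdir(directory)
            if name.endswith(extension)
        )
        if len(frames) >= 2:  # at least two frames are needed to form a pair
            pairs[directory] = contiguous_pairs(frames)
    return pairs
```

A directory with N frames yields N-1 pairs, so every input frame except the first and last takes part in two interpolations.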

> python3 -m frame_interpolation.eval.interpolator_cli \
     --pattern "frame_interpolation/photos" \
     --model_path <pretrained_models>/film_net/Style/saved_model \
     --times_to_interpolate 6 \
     --output_video

You will find the interpolated frames (including the input frames) in 'frame_interpolation/photos/interpolated_frames/', and the interpolated video at 'frame_interpolation/photos/interpolated.mp4'.

The number of frames is determined by --times_to_interpolate, which controls the number of times the frame interpolator is invoked. When a directory contains 2 frames, the number of output frames is 2^times_to_interpolate + 1; for example, --times_to_interpolate 6 yields 2^6 + 1 = 65 frames.
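
The frame count can be worked out with a few lines of Python. This is a sketch under stated assumptions rather than the repository's code: it assumes each invocation inserts a midpoint between every pair of neighbouring frames, and the generalisation to more than two input frames (shared frames counted once) is an extrapolation from the two-frame rule stated above. Both function names are hypothetical.

```python
def num_output_frames(num_input_frames, times_to_interpolate):
    """Total frames after recursive midpoint interpolation.

    Each of the (num_input_frames - 1) gaps is split into
    2**times_to_interpolate intervals; frames shared by two gaps count once.
    """
    if num_input_frames < 2:
        raise ValueError("at least two input frames are required")
    if times_to_interpolate < 0:
        raise ValueError("times_to_interpolate must be non-negative")
    gaps = num_input_frames - 1
    return gaps * 2 ** times_to_interpolate + 1


def midpoint_timestamps(times_to_interpolate):
    """Timestamps in [0, 1] of all frames between two inputs, endpoints included."""
    steps = 2 ** times_to_interpolate
    return [i / steps for i in range(steps + 1)]
```

With two input frames, `num_output_frames(2, 6)` gives 65, matching 2^6 + 1, and `midpoint_timestamps(1)` gives the two inputs plus the t=0.5 mid-frame.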

Datasets

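We use Vimeo-90K as our main training dataset. For quantitative evaluations, we rely on commonly used benchmark datasets, specifically Vimeo-90K, Middlebury, UCF101, and Xiph, which are the datasets covered by the evaluation configuration files listed under Evaluation on Benchmarks.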

Creating a TFRecord

The training and benchmark evaluation scripts expect the frame triplets in the TFRecord storage format.

We have included scripts that encode the relevant frame triplets into a tf.train.Example data format, and export to a TFRecord file.

You can run any of these scripts with --help for more information, for example python3 -m frame_interpolation.datasets.create_middlebury_tfrecord --help.

For example, run the command below to create a TFRecord for the Middlebury-other dataset. Download the images and point --input_dir to the unzipped folder path.

> python3 -m frame_interpolation.datasets.create_middlebury_tfrecord \
    --input_dir=<input_dir> \
    --output_tfrecord_filepath=<output_tfrecord_filepath>

Training

Below are our training gin configuration files for the different loss functions:

frame_interpolation/training/
├── config/
│   ├── film_net-L1.gin
│   ├── film_net-VGG.gin
│   ├── film_net-Style.gin

To launch training, simply pass the configuration file path of the desired experiment.
By default, training uses all visible GPUs. To debug or train on a CPU, append --mode cpu.
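
When several loss configurations are trained in turn, the launch command can be assembled programmatically. The sketch below is illustrative only: `training_command` is a hypothetical helper, the configuration names come from the listing above, the flags are the ones used in this section, and the base folder and label are whatever values the caller chooses.

```python
# Config file names come from the listing above.
LOSS_CONFIGS = ("film_net-L1.gin", "film_net-VGG.gin", "film_net-Style.gin")
CONFIG_DIR = "frame_interpolation/training/config"


def training_command(config_name, base_folder, label, cpu=False):
    """Build the argument list for one training run (nothing is executed)."""
    if config_name not in LOSS_CONFIGS:
        raise ValueError(f"unknown config: {config_name}")
    command = [
        "python3", "-m", "frame_interpolation.training.train",
        "--gin_config", f"{CONFIG_DIR}/{config_name}",
        "--base_folder", base_folder,
        "--label", label,
    ]
    if cpu:
        # Appended only when debugging or training on a CPU.
        command += ["--mode", "cpu"]
    return command
```

The returned list can be handed to `subprocess.run(command, check=True)`; building a list instead of a shell string avoids quoting problems with paths that contain spaces.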

> python3 -m frame_interpolation.training.train \
     --gin_config frame_interpolation/training/config/<config filename>.gin \
     --base_folder <base_folder> \
     --label <label>
  • When training finishes, the folder structure will look like this:

<base_folder>/
├── <label>/

Build a SavedModel

Optionally, to build a SavedModel from a folder of trained checkpoints, you can use this command:

> python3 -m frame_interpolation.training.build_saved_model_cli \
     --base_folder <base_folder> \
     --label <label>
  • By default, a SavedModel is created when the training loop ends, and it is saved inside the training output folder identified by --base_folder and --label.

Evaluation on Benchmarks

Below are the evaluation gin configuration files for the benchmarks we considered:

frame_interpolation/eval/
├── config/
│   ├── middlebury.gin
│   ├── ucf101.gin
│   ├── vimeo_90K.gin
│   ├── xiph_2K.gin
│   ├── xiph_4K.gin

To run an evaluation, simply pass the configuration file of the desired evaluation dataset.
If a GPU is visible, it runs on it.

> python3 -m frame_interpolation.eval.eval_cli -- \
     --gin_config frame_interpolation/eval/config/<eval_dataset>.gin \
     --model_path <pretrained_models>/film_net/L1/saved_model

The above command will produce the PSNR and SSIM scores presented in the paper.
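
For readers unfamiliar with the metric, PSNR follows the standard definition 10 * log10(MAX^2 / MSE). The sketch below implements that textbook formula in plain Python. It is not the repository's evaluation code: representing an image as a flat sequence of pixel values and using a peak value of 1.0 are assumptions of the sketch, and SSIM is omitted because it is considerably more involved.

```python
import math


def psnr(reference, prediction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(prediction) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and have the same length")
    # Mean squared error over all pixel values.
    mse = sum((r - p) ** 2 for r, p in zip(reference, prediction)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate a closer match to the ground-truth frame; identical inputs give an infinite score.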

Citation

If you find this implementation useful in your work, please acknowledge it appropriately by citing:

@inproceedings{reda2022film,
 title = {Frame Interpolation for Large Motion},
 author = {Fitsum Reda and Janne Kontkanen and Eric Tabellion and Deqing Sun and Caroline Pantofaru and Brian Curless},
 booktitle = {arXiv},
 year = {2022}
}
@misc{film-tf,
  title = {Tensorflow 2 Implementation of "FILM: Frame Interpolation for Large Scene Motion"},
  author = {Fitsum Reda and Janne Kontkanen and Eric Tabellion and Deqing Sun and Caroline Pantofaru and Brian Curless},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/google-research/frame-interpolation}}
}

Contact: Fitsum Reda

Acknowledgments

We would like to thank Richard Tucker, Jason Lai and David Minnen. We would also like to thank Jamie Aspinall for the imagery included in this repository.

Coding style

  • 2 spaces for indentation
  • 80 character line length
  • PEP8 formatting

Disclaimer

This is not an officially supported Google product.
