Tensorflow 2 implementation of our high quality frame interpolation neural network

Overview

FILM: Frame Interpolation for Large Scene Motion

Project | Paper | YouTube | Benchmark Scores

Tensorflow 2 implementation of our high quality frame interpolation neural network. We present a unified single-network approach that doesn't use additional pre-trained networks, like optical flow or depth, and yet achieve state-of-the-art results. We use a multi-scale feature extractor that shares the same convolution weights across the scales. Our model is trainable from frame triplets alone.

FILM: Frame Interpolation for Large Motion
Fitsum Reda, Janne Kontkanen, Eric Tabellion, Deqing Sun, Caroline Pantofaru, Brian Curless
Google Research
Technical Report 2022.

A sample 2 seconds moment. FILM transforms near-duplicate photos into a slow motion footage that look like it is shot with a video camera.

Installation

  • Get Frame Interpolation source codes
> git clone https://github.com/google-research/frame-interpolation frame_interpolation
  • Optionally, pull the recommended Docker base image
> docker pull gcr.io/deeplearning-platform-release/tf2-gpu.2-6:latest
  • Install dependencies
> pip install -r frame_interpolation/requirements.txt
> apt-get install ffmpeg

Pre-trained Models

  • Create a directory where you can keep large files. Ideally, not in this directory.
> mkdir 
   

   
  • Download pre-trained TF2 Saved Models from google drive and put into .

The downloaded folder should have the following structure:

pretrained_models/
├── film_net/
│   ├── L1/
│   ├── VGG/
│   ├── Style/
├── vgg/
│   ├── imagenet-vgg-verydeep-19.mat

Running the Codes

The following instructions run the interpolator on the photos provided in frame_interpolation/photos.

One mid-frame interpolation

To generate an intermediate photo from the input near-duplicate photos, simply run:

> python3 -m frame_interpolation.eval.interpolator_test \
     --frame1 frame_interpolation/photos/one.png \
     --frame2 frame_interpolation/photos/two.png \
     --model_path 
   
    /film_net/Style/saved_model \
     --output_frame frame_interpolation/photos/middle.png \

   

This will produce the sub-frame at t=0.5 and save as 'frame_interpolation/photos/middle.png'.

Many in-between frames interpolation

Takes in a set of directories identified by a glob (--pattern). Each directory is expected to contain at least two input frames, with each contiguous frame pair treated as an input to generate in-between frames.

/film_net/Style/saved_model \ --times_to_interpolate 6 \ --output_video">
> python3 -m frame_interpolation.eval.interpolator_cli \
     --pattern "frame_interpolation/photos" \
     --model_path 
   
    /film_net/Style/saved_model \
     --times_to_interpolate 6 \
     --output_video

   

You will find the interpolated frames (including the input frames) in 'frame_interpolation/photos/interpolated_frames/', and the interpolated video at 'frame_interpolation/photos/interpolated.mp4'.

The number of frames is determined by --times_to_interpolate, which controls the number of times the frame interpolator is invoked. When the number of frames in a directory is 2, the number of output frames will be 2^times_to_interpolate+1.

Datasets

We use Vimeo-90K as our main training dataset. For quantitative evaluations, we rely on commonly used benchmark datasets, specifically:

Creating a TFRecord

The training and benchmark evaluation scripts expect the frame triplets in the TFRecord storage format.

We have included scripts that encode the relevant frame triplets into a tf.train.Example data format, and export to a TFRecord file.

You can use the commands python3 -m frame_interpolation.datasets.create_ _tfrecord --help for more information.

For example, run the command below to create a TFRecord for the Middlebury-other dataset. Download the images and point --input_dir to the unzipped folder path.

> python3 -m frame_interpolation.datasets.create_middlebury_tfrecord \
    --input_dir=
   
     \
    --output_tfrecord_filepath=
    

   

Training

Below are our training gin configuration files for the different loss function:

frame_interpolation/training/
├── config/
│   ├── film_net-L1.gin
│   ├── film_net-VGG.gin
│   ├── film_net-Style.gin

To launch a training, simply pass the configuration filepath to the desired experiment.
By default, it uses all visible GPUs for training. To debug or train on a CPU, append --mode cpu.

> python3 -m frame_interpolation.training.train \
     --gin_config frame_interpolation/training/config/
   
    .gin \
     --base_folder 
     \
     --label 
    

    
   
  • When training finishes, the folder structure will look like this:

   
    /
├── 
    
   

Build a SavedModel

Optionally, to build a SavedModel format from a trained checkpoints folder, you can use this command:

> python3 -m frame_interpolation.training.build_saved_model_cli \
     --base_folder  \
     --label 
   

   
  • By default, a SavedModel is created when the training loop ends, and it will be saved at / .

Evaluation on Benchmarks

Below, we provided the evaluation gin configuration files for the benchmarks we have considered:

frame_interpolation/eval/
├── config/
│   ├── middlebury.gin
│   ├── ucf101.gin
│   ├── vimeo_90K.gin
│   ├── xiph_2K.gin
│   ├── xiph_4K.gin

To run an evaluation, simply pass the configuration file of the desired evaluation dataset.
If a GPU is visible, it runs on it.

> python3 -m frame_interpolation.eval.eval_cli -- \
     --gin_config frame_interpolation/eval/config/
   
    .gin \
     --model_path 
    
     /film_net/L1/saved_model

    
   

The above command will produce the PSNR and SSIM scores presented in the paper.

Citation

If you find this implementation useful in your works, please acknowledge it appropriately by citing:

@inproceedings{reda2022film,
 title = {Frame Interpolation for Large Motion},
 author = {Fitsum Reda and Janne Kontkanen and Eric Tabellion and Deqing Sun and Caroline Pantofaru and Brian Curless},
 booktitle = {arXiv},
 year = {2022}
}
@misc{film-tf,
  title = {Tensorflow 2 Implementation of "FILM: Frame Interpolation for Large Scene Motion"},
  author = {Fitsum Reda and Janne Kontkanen and Eric Tabellion and Deqing Sun and Caroline Pantofaru and Brian Curless},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/google-research/frame-interpolation}}
}

Contact: Fitsum Reda ([email protected])

Acknowledgments

We would like to thank Richard Tucker, Jason Lai and David Minnen. We would also like to thank Jamie Aspinall for the imagery included in this repository.

Coding style

  • 2 spaces for indentation
  • 80 character line length
  • PEP8 formatting

Disclaimer

This is not an officially supported Google product.

A PyTorch implementation of "ANEMONE: Graph Anomaly Detection with Multi-Scale Contrastive Learning", CIKM-21

ANEMONE A PyTorch implementation of "ANEMONE: Graph Anomaly Detection with Multi-Scale Contrastive Learning", CIKM-21 Dependencies python==3.6.1 dgl==

Graph Analysis & Deep Learning Laboratory, GRAND 30 Dec 14, 2022
My published benchmark for a Kaggle Simulations Competition

Lux AI Working Title Bot Please refer to the Kaggle notebook for the comment section. The comment section contains my explanation on my code structure

Tong Hui Kang 29 Aug 22, 2022
End-To-End Memory Network using Tensorflow

MemN2N Implementation of End-To-End Memory Networks with sklearn-like interface using Tensorflow. Tasks are from the bAbl dataset. Get Started git clo

Dominique Luna 339 Oct 27, 2022
The code for our NeurIPS 2021 paper "Kernelized Heterogeneous Risk Minimization".

Kernelized-HRM Jiashuo Liu, Zheyuan Hu The code for our NeurIPS 2021 paper "Kernelized Heterogeneous Risk Minimization"[1]. This repo contains the cod

Liu Jiashuo 8 Nov 20, 2022
Moer Grounded Image Captioning by Distilling Image-Text Matching Model

Moer Grounded Image Captioning by Distilling Image-Text Matching Model Requirements Python 3.7 Pytorch 1.2 Prepare data Please use git clone --recurse

YE Zhou 60 Dec 16, 2022
A PyTorch-Based Framework for Deep Learning in Computer Vision

TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision @misc{you2019torchcv, author = {Ansheng You and Xiangtai Li and Zhen Zhu a

Donny You 2.2k Jan 09, 2023
The offcial repository for 'CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos', SIGIR2022

CharacterBERT-DR The offcial repository for CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos, Sh

ielab 11 Nov 15, 2022
Official PyTorch Implementation of GAN-Supervised Dense Visual Alignment

GAN-Supervised Dense Visual Alignment — Official PyTorch Implementation Paper | Project Page | Video This repo contains training, evaluation and visua

944 Jan 07, 2023
For visualizing the dair-v2x-i dataset

3D Detection & Tracking Viewer The project is based on hailanyi/3D-Detection-Tracking-Viewer and is modified, you can find the original version of the

34 Dec 29, 2022
PyTorch Implementation of our paper Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation

PyTorch Implementation of our paper Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation

Zechen Bai 12 Jul 08, 2022
An example of time series augmentation methods with Keras

Time Series Augmentation This is a collection of time series data augmentation methods and an example use using Keras. News 2020/04/16: Repository Cre

九州大学 ヒューマンインタフェース研究室 229 Jan 02, 2023
Code, final versions, and information on the Sparkfun Graphical Datasheets

Graphical Datasheets Code, final versions, and information on the SparkFun Graphical Datasheets. Generated Cells After Running Script Example Complete

SparkFun Electronics 102 Jan 05, 2023
Algorithmic encoding of protected characteristics and its implications on disparities across subgroups

Algorithmic encoding of protected characteristics and its implications on disparities across subgroups This repository contains the code for the paper

Team MIRA - BioMedIA 15 Oct 24, 2022
《Rethinking Sptil Dimensions of Vision Trnsformers》(2021)

Rethinking Spatial Dimensions of Vision Transformers Byeongho Heo, Sangdoo Yun, Dongyoon Han, Sanghyuk Chun, Junsuk Choe, Seong Joon Oh | Paper NAVER

NAVER AI 224 Dec 27, 2022
Finetune alexnet with tensorflow - Code for finetuning AlexNet in TensorFlow >= 1.2rc0

Finetune AlexNet with Tensorflow Update 15.06.2016 I revised the entire code base to work with the new input pipeline coming with TensorFlow = versio

Frederik Kratzert 766 Jan 04, 2023
Official repository for "Action-Based Conversations Dataset: A Corpus for Building More In-Depth Task-Oriented Dialogue Systems"

Action-Based Conversations Dataset (ABCD) This respository contains the code and data for ABCD (Chen et al., 2021) Introduction Whereas existing goal-

ASAPP Research 49 Oct 09, 2022
BrainGNN - A deep learning model for data-driven discovery of functional connectivity

A deep learning model for data-driven discovery of functional connectivity https://doi.org/10.3390/a14030075 Usman Mahmood, Zengin Fu, Vince D. Calhou

Usman Mahmood 3 Aug 28, 2022
DISTIL: Deep dIverSified inTeractIve Learning.

DISTIL: Deep dIverSified inTeractIve Learning. An active/inter-active learning library built on py-torch for reducing labeling costs.

decile-team 110 Dec 06, 2022
SNE-RoadSeg in PyTorch, ECCV 2020

SNE-RoadSeg Introduction This is the official PyTorch implementation of SNE-RoadSeg: Incorporating Surface Normal Information into Semantic Segmentati

242 Dec 20, 2022
Development of IP code based on VIPs and AADM

Sparse Implicit Processes In this repository we include the two different versions of the SIP code developed for the article Sparse Implicit Processes

1 Aug 22, 2022