This repository contains a re-implementation of the code for the CVPR 2021 paper "Omnimatte: Associating Objects and Their Effects in Video."

Last update: Dec 28, 2022

Omnimatte in PyTorch

Prerequisites

- Linux
- Python 3.6+
- NVIDIA GPU + CUDA CuDNN

Installation

This code has been tested with PyTorch 1.8 and Python 3.8.

Install PyTorch 1.8 and the other dependencies:
- For pip users: run `pip install -r requirements.txt`.
- For Conda users: create a new Conda environment with `conda env create -f environment.yml`.
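
After installing, a quick sanity check of the environment can save a failed training run. The following is a minimal sketch, not part of this repository: the helper name `check_environment` is hypothetical, and it only checks the Python version from the prerequisites and whether PyTorch can see a CUDA device.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 6)):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    # The prerequisites list Python 3.6+.
    if sys.version_info < min_python:
        problems.append("Python %d.%d+ is required" % min_python)
    # Look for PyTorch without importing it, so a missing install does not raise.
    if importlib.util.find_spec("torch") is None:
        problems.append("PyTorch is not installed")
    else:
        import torch
        if not torch.cuda.is_available():
            problems.append("PyTorch cannot see a CUDA device")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks OK"]:
        print(line)
```

The check deliberately does not compare the installed PyTorch version against 1.8; that is only the version the code was tested with.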

Demo

To train a model on a video (e.g. "tennis"), run:

python train.py --name tennis --dataroot ./datasets/tennis --gpu_ids 0,1

To view training results and loss plots, open http://localhost:8097 in a browser. Intermediate results are also saved to ./checkpoints/tennis/web/index.html.

To save the omnimatte layer outputs of the trained model, run:

python test.py --name tennis --dataroot ./datasets/tennis --gpu_ids 0

The results (RGBA layers, videos) will be saved to ./results/tennis/test_latest/.

Custom video

To train on your own video, you will have to preprocess the data:

- Extract the frames, e.g.

mkdir ./datasets/my_video && cd ./datasets/my_video
mkdir rgb && ffmpeg -i video.mp4 rgb/%04d.png

- Resize the video to 256x448 and save the frames in my_video/rgb.
- Get input object masks (e.g. using Mask-RCNN and STM) and save each object's masks in its own subdirectory, e.g. my_video/mask/01/, my_video/mask/02/, etc.
- Compute flow (e.g. using RAFT), and save the forward .flo files to my_video/flow and the backward flow to my_video/flow_backward.
- Compute the confidence maps from the forward/backward flows (the command below uses the tennis example; substitute your own video directory):
```
python datasets/confidence.py --dataroot ./datasets/tennis
```
- Register the video and save the computed homographies in my_video/homographies.txt. See here for details.
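
Before training on a custom video, it can help to confirm that the directory contains everything the steps above produce. The following is a minimal sketch under stated assumptions: `check_video_dir` is a hypothetical helper that is not part of this repository, and it only checks for the directory and file names mentioned above, not their contents.

```python
from pathlib import Path


def check_video_dir(root):
    """Return a list of pieces missing from a preprocessed video directory."""
    root = Path(root)
    missing = []
    # Directories named in the preprocessing steps.
    for sub in ("rgb", "mask", "flow", "flow_backward"):
        if not (root / sub).is_dir():
            missing.append(sub + "/")
    # Each object should have its own mask subdirectory (mask/01/, mask/02/, ...).
    mask_dir = root / "mask"
    if mask_dir.is_dir() and not any(p.is_dir() for p in mask_dir.iterdir()):
        missing.append("mask/<object>/")
    # Homographies from the registration step.
    if not (root / "homographies.txt").is_file():
        missing.append("homographies.txt")
    return missing
```

For example, `check_video_dir("./datasets/my_video")` returns an empty list once all of the listed pieces are in place.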

Note: Videos that are suitable for our method have the following attributes:

- Static camera or limited camera motion that can be represented with a homography.
- Limited number of omnimatte layers, due to GPU memory limitations. We tested up to 6 layers.
- Objects that move relative to the background (static objects will be absorbed into the background layer).
- We tested a video length of up to 200 frames (~7 seconds).
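
To make the camera-motion requirement concrete: a homography is a 3x3 matrix that maps pixel coordinates in one frame to another, with a division by the third homogeneous coordinate. The sketch below is illustrative only. `apply_homography` is a hypothetical helper, and it does not read or assume the format of homographies.txt.

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through a 3x3 homography H given as nested lists."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return xh / w, yh / w


# A pure translation (camera shifted by 5 px in x and -2 px in y) is the
# simplest homography; rotations, zooms and perspective changes also fit.
H_shift = [[1.0, 0.0, 5.0],
           [0.0, 1.0, -2.0],
           [0.0, 0.0, 1.0]]
print(apply_homography(H_shift, 10.0, 20.0))  # (15.0, 18.0)
```

Camera motion with strong parallax, where near and far scene points move differently, cannot be captured by a single matrix of this kind.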

Citation

If you use this code for your research, please cite the following paper:

@inproceedings{lu2021,
  title={Omnimatte: Associating Objects and Their Effects in Video},
  author={Lu, Erika and Cole, Forrester and Dekel, Tali and Zisserman, Andrew and Freeman, William T and Rubinstein, Michael},
  booktitle={CVPR},
  year={2021}
}

Acknowledgments

This code is based on retiming and pytorch-CycleGAN-and-pix2pix.

Owner

Erika Lu
