Demo code for paper "Learning optical flow from still images", CVPR 2021.

Last update: Dec 25, 2022

Depthstillation

[Project page] - [Paper] - [Supplementary]

This code is provided to replicate the qualitative results shown in the supplementary material, Sections 2-4. The code has been tested using Ubuntu 20.04 LTS, Python 3.8, and gcc 9.3.0.

Reference

If you find this code useful, please cite our work:

@inproceedings{Aleotti_CVPR_2021,
  title     = {Learning optical flow from still images},
  author    = {Aleotti, Filippo and
               Poggi, Matteo and
               Mattoccia, Stefano},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2021}
}

Introduction
Usage
Supplementary
Weights
Contacts
Acknowledgments

Introduction

This paper deals with the scarcity of data for training optical flow networks, highlighting the limitations of existing sources such as labeled synthetic datasets or unlabeled real videos. Specifically, we introduce a framework to generate accurate ground-truth optical flow annotations quickly and in large amounts from any readily available single real picture. Given an image, we use an off-the-shelf monocular depth estimation network to build a plausible point cloud for the observed scene. Then, we virtually move the camera in the reconstructed environment with known motion vectors and rotation angles, allowing us to synthesize both a novel view and the corresponding optical flow field connecting each pixel in the input image to the one in the new frame. When trained with our data, state-of-the-art optical flow networks achieve superior generalization to unseen real data compared to the same models trained either on annotated synthetic datasets or unlabeled videos, and better specialization if combined with synthetic images.
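
The geometry behind this pipeline can be illustrated with a minimal NumPy sketch. It is not the code in depthstillation.py: the function name flow_from_depth, the pinhole camera model, and the convention that (R, t) maps points from the original camera frame to the virtual one are assumptions made for illustration.

```python
import numpy as np


def flow_from_depth(depth, K, R, t):
    """Return the (H, W, 2) flow induced by a virtual camera motion (R, t).

    depth: (H, W) positive depth map of the input image.
    K:     (3, 3) pinhole intrinsics (assumed camera model).
    R, t:  rotation (3, 3) and translation (3,) mapping points from the
           original camera frame to the virtual camera frame (assumed convention).
    """
    h, w = depth.shape
    # Homogeneous pixel grid, shape (3, H*W), in row-major order.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pixels = np.stack([u, v, np.ones_like(u)]).reshape(3, -1).astype(np.float64)

    # Back-project every pixel to a 3D point using its depth.
    points = (np.linalg.inv(K) @ pixels) * depth.reshape(1, -1)

    # Express the points in the virtual camera frame and project them.
    moved = R @ points + t.reshape(3, 1)
    projected = K @ moved
    new_uv = projected[:2] / projected[2:3]

    # The flow is the displacement of each pixel between the two views.
    return (new_uv - pixels[:2]).T.reshape(h, w, 2)
```

With a constant depth and a pure sideways translation, every pixel receives the same horizontal displacement; with a real depth map, closer points move more than distant ones, which is what makes the generated flow non-trivial.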

Usage

Install the project requirements in a new Python 3 environment:

virtualenv -p python3 learning_flow_env
source learning_flow_env/bin/activate
pip install -r requirements.txt

Compile the forward_warping module, written in C (required to handle warping collisions):

cd external/forward_warping
bash compile.sh
cd ../..
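
A collision happens when several source pixels land on the same target pixel after warping. The compiled module is not reproduced here; the pure-Python sketch below only illustrates the idea, assuming that the point closest to the camera wins and that targets are rounded to the nearest pixel. The function name forward_warp and its arguments are hypothetical.

```python
import numpy as np


def forward_warp(image, flow, depth):
    """Splat each source pixel to its target; the nearest depth wins collisions.

    image: (H, W, C) source image.
    flow:  (H, W, 2) per-pixel displacement (dx, dy) in pixels.
    depth: (H, W) depth of each source pixel.
    Returns the warped image and a boolean mask of the pixels that were filled.
    """
    h, w = depth.shape
    warped = np.zeros_like(image)
    zbuffer = np.full((h, w), np.inf)

    ys, xs = np.mgrid[0:h, 0:w]
    # Round targets to the nearest pixel (a simplification of this sketch).
    tx = np.rint(xs + flow[..., 0]).astype(int)
    ty = np.rint(ys + flow[..., 1]).astype(int)

    for y in range(h):
        for x in range(w):
            u, v = tx[y, x], ty[y, x]
            if not (0 <= u < w and 0 <= v < h):
                continue  # the pixel leaves the frame
            if depth[y, x] < zbuffer[v, u]:
                # Closer than anything splatted here so far: overwrite.
                zbuffer[v, u] = depth[y, x]
                warped[v, u] = image[y, x]

    filled = np.isfinite(zbuffer)
    return warped, filled
```

Pixels left unfilled are holes in the warped image. The nested loop is slow in Python, which is one plausible reason to implement this step in C.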

You are now ready to run the depthstillation.py script:

python depthstillation.py

By switching some parameters, you can generate all the qualitative results provided in the supplementary material.

These parameters are:

num_motions: changes the number of virtual motions
segment: enables instance segmentation (for independently moving objects)
mask_type: mask selection. Options are H' and H
num_objects: sets the number of independently moving objects (one, in this example)
no_depth: disables monocular depth and forces depth to assume a constant value
no_sharp: disables depth sharpening
change_k: uses different intrinsics K
change_motion: samples a different motion (ignored if num_motions is greater than 1)

For instance, to simulate different intrinsics K, just run:

python depthstillation.py --change_k

The results are saved in the dCOCO folder, organized as follows:

depth_color: colored depth map
flow: generated flow labels (in 16-bit KITTI format)
flow_color: colored flow labels
H: H mask
H': H' mask
im0: real input image
im1: generated virtual image
im1_raw: generated virtual image (pre-inpainting)
instances_color: colored instance map (if --segment is enabled)
M: M mask
M': M' mask
P: P mask
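
For readers unfamiliar with the 16-bit KITTI flow format mentioned above, here is a minimal sketch of the encoding, assuming the files follow the KITTI devkit convention: each flow component is stored as value * 64 + 2^15 in an unsigned 16-bit channel, and a third channel marks valid pixels. The function names are hypothetical, and the sketch works on arrays only, without file I/O.

```python
import numpy as np


def encode_kitti_flow(flow, valid):
    """Pack an (H, W, 2) float flow and an (H, W) validity mask into uint16."""
    scaled = np.clip(flow * 64.0 + 2 ** 15, 0, 2 ** 16 - 1)
    encoded = np.zeros(flow.shape[:2] + (3,), dtype=np.uint16)
    encoded[..., 0] = np.rint(scaled[..., 0]).astype(np.uint16)  # horizontal
    encoded[..., 1] = np.rint(scaled[..., 1]).astype(np.uint16)  # vertical
    encoded[..., 2] = valid.astype(np.uint16)                    # validity
    return encoded


def decode_kitti_flow(encoded):
    """Recover the float flow and the boolean validity mask."""
    flow = (encoded[..., :2].astype(np.float32) - 2 ** 15) / 64.0
    valid = encoded[..., 2] > 0
    return flow, valid
```

The format quantizes flow to 1/64 of a pixel. When loading the PNG files with an image library, check the channel order first: OpenCV, for instance, returns channels as BGR, so the array may need to be reversed along its last axis before decoding.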

We report the list of files used to depthstill dCOCO in samples/dCOCO_file_list.txt

Supplementary

We report here the list of commands to obtain, in the same order, the Figures shown in Sections 2-4 of the Supplementary Material:

Section 2 -- the first figure is obtained with default parameters, then we use --no_depth and --no_depth --segment respectively
Section 3 -- the first figure is obtained with --no_sharp, the remaining figures with default parameters or by setting --mask_type "H".
Section 4 -- we show three times the results obtained by default parameters, followed respectively by figures generated using --change_k, --change_motion and --segment individually.
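
As a convenience, the flag combinations listed above can be collected in a small helper. This is only a sketch: the dictionary lists the distinct combinations named for each section rather than one entry per figure, and the helper name build_commands is hypothetical.

```python
import sys

# Flag combinations named for each supplementary section (an empty list
# means default parameters). These are the distinct combinations, not a
# one-to-one list of figures.
SECTION_FLAGS = {
    "Section 2": [[], ["--no_depth"], ["--no_depth", "--segment"]],
    "Section 3": [["--no_sharp"], [], ["--mask_type", "H"]],
    "Section 4": [[], ["--change_k"], ["--change_motion"], ["--segment"]],
}


def build_commands(section):
    """Return one argv list per flag combination of the given section."""
    base = [sys.executable, "depthstillation.py"]
    return [base + flags for flags in SECTION_FLAGS[section]]
```

Each returned list can be passed to subprocess.run(cmd, check=True) from the repository root. Whether successive runs overwrite the contents of the dCOCO folder is not stated here, so it may be safer to copy the results elsewhere between runs.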

Weights

We provide the RAFT models trained in our experiments. To run them and reproduce our results, please refer to the RAFT repository:

Tab. 4 (C) dCOCO (D) Ch->Th->dCOCO
Tab. 5 (C) dCOCO (fine-tuned) (D) Ch->Th->dCOCO (fine-tuned)
Tab. 7 (C) dDAVIS
Tab. 8 (C) dKITTI

Contacts

m [dot] poggi [at] unibo [dot] it

Acknowledgments

Thanks to Clément Godard and Niantic for sharing the monodepth2 code, used to simulate camera motion.

Our work is inspired by Jamie Watson et al., Learning Stereo from Single Images.
