Dynamic View Synthesis from Dynamic Monocular Video

Overview

Dynamic View Synthesis from Dynamic Monocular Video

arXiv

Project Website | Video | Paper

Dynamic View Synthesis from Dynamic Monocular Video
Chen Gao, Ayush Saraf, Johannes Kopf, Jia-Bin Huang
in ICCV 2021

Setup

The code is test with

  • Linux (tested on CentOS Linux release 7.4.1708)
  • Anaconda 3
  • Python 3.7.11
  • CUDA 10.1
  • 1 V100 GPU

To get started, please create the conda environment dnerf by running

conda create --name dnerf
conda activate dnerf
conda install pytorch=1.6.0 torchvision=0.7.0 cudatoolkit=10.1 matplotlib tensorboard scipy opencv -c pytorch
pip install imageio configargparse timm lpips

and install COLMAP manually. Then download MiDaS and RAFT weights

ROOT_PATH=/path/to/the/DynamicNeRF/folder
cd $ROOT_PATH
wget --no-check-certificate https://filebox.ece.vt.edu/~chengao/free-view-video/weights.zip
unzip weights.zip
rm weights.zip

Dynamic Scene Dataset

The Dynamic Scene Dataset is used to quantitatively evaluate our method. Please download the pre-processed data by running:

cd $ROOT_PATH
wget --no-check-certificate https://filebox.ece.vt.edu/~chengao/free-view-video/data.zip
unzip data.zip
rm data.zip

Training

You can train a model from scratch by running:

cd $ROOT_PATH/
python run_nerf.py --config configs/config_Balloon2.txt

Every 100k iterations, you should get videos like the following examples

The novel view-time synthesis results will be saved in $ROOT_PATH/logs/Balloon2_H270_DyNeRF/novelviewtime. novelviewtime

The reconstruction results will be saved in $ROOT_PATH/logs/Balloon2_H270_DyNeRF/testset. testset

The fix-view-change-time results will be saved in $ROOT_PATH/logs/Balloon2_H270_DyNeRF/testset_view000. testset_view000

The fix-time-change-view results will be saved in $ROOT_PATH/logs/Balloon2_H270_DyNeRF/testset_time000. testset_time000

Rendering from pre-trained models

We also provide pre-trained models. You can download them by running:

cd $ROOT_PATH/
wget --no-check-certificate https://filebox.ece.vt.edu/~chengao/free-view-video/logs.zip
unzip logs.zip
rm logs.zip

Then you can render the results directly by running:

python run_nerf.py --config configs/config_Balloon2.txt --render_only --ft_path $ROOT_PATH/logs/Balloon2_H270_DyNeRF_pretrain/300000.tar

Evaluating our method and others

Our goal is to make the evaluation as simple as possible for you. We have collected the fix-view-change-time results of the following methods:

NeRF
NeRF + t
Yoon et al.
Non-Rigid NeRF
NSFF
DynamicNeRF (ours)

Please download the results by running:

cd $ROOT_PATH/
wget --no-check-certificate https://filebox.ece.vt.edu/~chengao/free-view-video/results.zip
unzip results.zip
rm results.zip

Then you can calculate the PSNR/SSIM/LPIPS by running:

cd $ROOT_PATH/utils
python evaluation.py
PSNR / LPIPS Jumping Skating Truck Umbrella Balloon1 Balloon2 Playground Average
NeRF 20.99 / 0.305 23.67 / 0.311 22.73 / 0.229 21.29 / 0.440 19.82 / 0.205 24.37 / 0.098 21.07 / 0.165 21.99 / 0.250
NeRF + t 18.04 / 0.455 20.32 / 0.512 18.33 / 0.382 17.69 / 0.728 18.54 / 0.275 20.69 / 0.216 14.68 / 0.421 18.33 / 0.427
NR NeRF 20.09 / 0.287 23.95 / 0.227 19.33 / 0.446 19.63 / 0.421 17.39 / 0.348 22.41 / 0.213 15.06 / 0.317 19.69 / 0.323
NSFF 24.65 / 0.151 29.29 / 0.129 25.96 / 0.167 22.97 / 0.295 21.96 / 0.215 24.27 / 0.222 21.22 / 0.212 24.33 / 0.199
Ours 24.68 / 0.090 32.66 / 0.035 28.56 / 0.082 23.26 / 0.137 22.36 / 0.104 27.06 / 0.049 24.15 / 0.080 26.10 / 0.082

Please note:

  1. The numbers reported in the paper are calculated using TF code. The numbers here are calculated using this improved Pytorch version.
  2. In Yoon's results, the first frame and the last frame are missing. To compare with Yoon's results, we have to omit the first frame and the last frame. To do so, please uncomment line 72 and comment line 73 in evaluation.py.
  3. We obtain the results of NSFF and NR NeRF using the official implementation with default parameters.

Train a model on your sequence

  1. Set some paths
ROOT_PATH=/path/to/the/DynamicNeRF/folder
DATASET_NAME=name_of_the_video_without_extension
DATASET_PATH=$ROOT_PATH/data/$DATASET_NAME
  1. Prepare training images and background masks from a video.
cd $ROOT_PATH/utils
python generate_data.py --videopath /path/to/the/video
  1. Use COLMAP to obtain camera poses.
colmap feature_extractor \
--database_path $DATASET_PATH/database.db \
--image_path $DATASET_PATH/images_colmap \
--ImageReader.mask_path $DATASET_PATH/background_mask \
--ImageReader.single_camera 1

colmap exhaustive_matcher \
--database_path $DATASET_PATH/database.db

mkdir $DATASET_PATH/sparse
colmap mapper \
    --database_path $DATASET_PATH/database.db \
    --image_path $DATASET_PATH/images_colmap \
    --output_path $DATASET_PATH/sparse \
    --Mapper.num_threads 16 \
    --Mapper.init_min_tri_angle 4 \
    --Mapper.multiple_models 0 \
    --Mapper.extract_colors 0
  1. Save camera poses into the format that NeRF reads.
cd $ROOT_PATH/utils
python generate_pose.py --dataset_path $DATASET_PATH
  1. Estimate monocular depth.
cd $ROOT_PATH/utils
python generate_depth.py --dataset_path $DATASET_PATH --model $ROOT_PATH/weights/midas_v21-f6b98070.pt
  1. Predict optical flows.
cd $ROOT_PATH/utils
python generate_flow.py --dataset_path $DATASET_PATH --model $ROOT_PATH/weights/raft-things.pth
  1. Obtain motion mask (code adapted from NSFF).
cd $ROOT_PATH/utils
python generate_motion_mask.py --dataset_path $DATASET_PATH
  1. Train a model. Please change expname and datadir in configs/config.txt.
cd $ROOT_PATH/
python run_nerf.py --config configs/config.txt

Explanation of each parameter:

  • expname: experiment name
  • basedir: where to store ckpts and logs
  • datadir: input data directory
  • factor: downsample factor for the input images
  • N_rand: number of random rays per gradient step
  • N_samples: number of samples per ray
  • netwidth: channels per layer
  • use_viewdirs: whether enable view-dependency for StaticNeRF
  • use_viewdirsDyn: whether enable view-dependency for DynamicNeRF
  • raw_noise_std: std dev of noise added to regularize sigma_a output
  • no_ndc: do not use normalized device coordinates
  • lindisp: sampling linearly in disparity rather than depth
  • i_video: frequency of novel view-time synthesis video saving
  • i_testset: frequency of testset video saving
  • N_iters: number of training iterations
  • i_img: frequency of tensorboard image logging
  • DyNeRF_blending: whether use DynamicNeRF to predict blending weight
  • pretrain: whether pre-train StaticNeRF

License

This work is licensed under MIT License. See LICENSE for details.

If you find this code useful for your research, please consider citing the following paper:

@inproceedings{Gao-ICCV-DynNeRF,
    author    = {Gao, Chen and Saraf, Ayush and Kopf, Johannes and Huang, Jia-Bin},
    title     = {Dynamic View Synthesis from Dynamic Monocular Video},
    booktitle = {Proceedings of the IEEE International Conference on Computer Vision},
    year      = {2021}
}

Acknowledgments

Our training code is build upon NeRF, NeRF-pytorch, and NSFF. Our flow prediction code is modified from RAFT. Our depth prediction code is modified from MiDaS.

Owner
Chen Gao
Ph.D. student at Virginia Tech Vision and Learning Lab (@vt-vl-lab). Former intern at Google and Facebook Research.
Chen Gao
Datasets, Transforms and Models specific to Computer Vision

vision Datasets, Transforms and Models specific to Computer Vision Installation First install the nightly version of OneFlow python3 -m pip install on

OneFlow 68 Dec 07, 2022
Pairwise Learning for Neural Link Prediction for OGB (PLNLP-OGB)

Pairwise Learning for Neural Link Prediction for OGB (PLNLP-OGB) This repository provides evaluation codes of PLNLP for OGB link property prediction t

Zhitao WANG 31 Oct 10, 2022
NeurIPS 2021 paper 'Representation Learning on Spatial Networks' code

Representation Learning on Spatial Networks This repository is the official implementation of Representation Learning on Spatial Networks. Training Ex

13 Dec 29, 2022
Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations. [2021]

Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations This repo contains the Pytorch implementation of our paper: Revisit

Wouter Van Gansbeke 80 Nov 20, 2022
Official PyTorch implementation of the paper "Deep Constrained Least Squares for Blind Image Super-Resolution", CVPR 2022.

Deep Constrained Least Squares for Blind Image Super-Resolution [Paper] This is the official implementation of 'Deep Constrained Least Squares for Bli

MEGVII Research 141 Dec 30, 2022
🍅🍅🍅YOLOv5-Lite: lighter, faster and easier to deploy. Evolved from yolov5 and the size of model is only 1.7M (int8) and 3.3M (fp16). It can reach 10+ FPS on the Raspberry Pi 4B when the input size is 320×320~

YOLOv5-Lite:lighter, faster and easier to deploy Perform a series of ablation experiments on yolov5 to make it lighter (smaller Flops, lower memory, a

pogg 1.5k Jan 05, 2023
AdaDM: Enabling Normalization for Image Super-Resolution

AdaDM AdaDM: Enabling Normalization for Image Super-Resolution. You can apply BN, LN or GN in SR networks with our AdaDM. Pretrained models (EDSR*/RDN

58 Jan 08, 2023
Official code repository for the work: "The Implicit Values of A Good Hand Shake: Handheld Multi-Frame Neural Depth Refinement"

Handheld Multi-Frame Neural Depth Refinement This is the official code repository for the work: The Implicit Values of A Good Hand Shake: Handheld Mul

55 Dec 14, 2022
Object Detection Projekt in GKI WS2021/22

tfObjectDetection Object Detection Projekt with tensorflow in GKI WS2021/22 Docker Container: docker run -it --name --gpus all -v path/to/project:p

Tim Eggers 1 Jul 18, 2022
Learning Skeletal Articulations with Neural Blend Shapes

This repository provides an end-to-end library for automatic character rigging and blend shapes generation as well as a visualization tool. It is based on our work Learning Skeletal Articulations wit

Peizhuo 504 Dec 30, 2022
Pytorch implementation for the paper: Contrastive Learning for Cold-start Recommendation

Contrastive Learning for Cold-start Recommendation This is our Pytorch implementation for the paper: Yinwei Wei, Xiang Wang, Qi Li, Liqiang Nie, Yan L

45 Dec 13, 2022
Official repo for QHack—the quantum machine learning hackathon

Note: This repository has been frozen while we consider the submissions for the QHack Open Hackathon. We hope you enjoyed the event! Welcome to QHack,

Xanadu 118 Jan 05, 2023
Some methods for comparing network representations in deep learning and neuroscience.

Generalized Shape Metrics on Neural Representations In neuroscience and in deep learning, quantifying the (dis)similarity of neural representations ac

Alex Williams 45 Dec 27, 2022
Official Repository for the paper "Improving Baselines in the Wild".

iWildCam and FMoW baselines (WILDS) This repository was originally forked from the official repository of WILDS datasets (commit 7e103ed) For general

Kazuki Irie 3 Nov 24, 2022
[AAAI-2022] Official implementations of MCL: Mutual Contrastive Learning for Visual Representation Learning

Mutual Contrastive Learning for Visual Representation Learning This project provides source code for our Mutual Contrastive Learning for Visual Repres

winycg 48 Jan 02, 2023
Image super-resolution through deep learning

srez Image super-resolution through deep learning. This project uses deep learning to upscale 16x16 images by a 4x factor. The resulting 64x64 images

David Garcia 5.3k Dec 28, 2022
NeuralDiff: Segmenting 3D objects that move in egocentric videos

NeuralDiff: Segmenting 3D objects that move in egocentric videos Project Page | Paper + Supplementary | Video About This repository contains the offic

Vadim Tschernezki 14 Dec 05, 2022
Official Pytorch Implementation of 'Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization' (ICCV-21 Oral)

Learning-Action-Completeness-from-Points Official Pytorch Implementation of 'Learning Action Completeness from Points for Weakly-supervised Temporal A

Pilhyeon Lee 67 Jan 03, 2023
Learning Continuous Signed Distance Functions for Shape Representation

DeepSDF This is an implementation of the CVPR '19 paper "DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation" by Park et a

Meta Research 1.1k Jan 01, 2023