Dynamic View Synthesis from Dynamic Monocular Video

Overview

Dynamic View Synthesis from Dynamic Monocular Video

arXiv

Project Website | Video | Paper

Dynamic View Synthesis from Dynamic Monocular Video
Chen Gao, Ayush Saraf, Johannes Kopf, Jia-Bin Huang
in ICCV 2021

Setup

The code is test with

  • Linux (tested on CentOS Linux release 7.4.1708)
  • Anaconda 3
  • Python 3.7.11
  • CUDA 10.1
  • 1 V100 GPU

To get started, please create the conda environment dnerf by running

conda create --name dnerf
conda activate dnerf
conda install pytorch=1.6.0 torchvision=0.7.0 cudatoolkit=10.1 matplotlib tensorboard scipy opencv -c pytorch
pip install imageio configargparse timm lpips

and install COLMAP manually. Then download MiDaS and RAFT weights

ROOT_PATH=/path/to/the/DynamicNeRF/folder
cd $ROOT_PATH
wget --no-check-certificate https://filebox.ece.vt.edu/~chengao/free-view-video/weights.zip
unzip weights.zip
rm weights.zip

Dynamic Scene Dataset

The Dynamic Scene Dataset is used to quantitatively evaluate our method. Please download the pre-processed data by running:

cd $ROOT_PATH
wget --no-check-certificate https://filebox.ece.vt.edu/~chengao/free-view-video/data.zip
unzip data.zip
rm data.zip

Training

You can train a model from scratch by running:

cd $ROOT_PATH/
python run_nerf.py --config configs/config_Balloon2.txt

Every 100k iterations, you should get videos like the following examples

The novel view-time synthesis results will be saved in $ROOT_PATH/logs/Balloon2_H270_DyNeRF/novelviewtime. novelviewtime

The reconstruction results will be saved in $ROOT_PATH/logs/Balloon2_H270_DyNeRF/testset. testset

The fix-view-change-time results will be saved in $ROOT_PATH/logs/Balloon2_H270_DyNeRF/testset_view000. testset_view000

The fix-time-change-view results will be saved in $ROOT_PATH/logs/Balloon2_H270_DyNeRF/testset_time000. testset_time000

Rendering from pre-trained models

We also provide pre-trained models. You can download them by running:

cd $ROOT_PATH/
wget --no-check-certificate https://filebox.ece.vt.edu/~chengao/free-view-video/logs.zip
unzip logs.zip
rm logs.zip

Then you can render the results directly by running:

python run_nerf.py --config configs/config_Balloon2.txt --render_only --ft_path $ROOT_PATH/logs/Balloon2_H270_DyNeRF_pretrain/300000.tar

Evaluating our method and others

Our goal is to make the evaluation as simple as possible for you. We have collected the fix-view-change-time results of the following methods:

NeRF
NeRF + t
Yoon et al.
Non-Rigid NeRF
NSFF
DynamicNeRF (ours)

Please download the results by running:

cd $ROOT_PATH/
wget --no-check-certificate https://filebox.ece.vt.edu/~chengao/free-view-video/results.zip
unzip results.zip
rm results.zip

Then you can calculate the PSNR/SSIM/LPIPS by running:

cd $ROOT_PATH/utils
python evaluation.py
PSNR / LPIPS Jumping Skating Truck Umbrella Balloon1 Balloon2 Playground Average
NeRF 20.99 / 0.305 23.67 / 0.311 22.73 / 0.229 21.29 / 0.440 19.82 / 0.205 24.37 / 0.098 21.07 / 0.165 21.99 / 0.250
NeRF + t 18.04 / 0.455 20.32 / 0.512 18.33 / 0.382 17.69 / 0.728 18.54 / 0.275 20.69 / 0.216 14.68 / 0.421 18.33 / 0.427
NR NeRF 20.09 / 0.287 23.95 / 0.227 19.33 / 0.446 19.63 / 0.421 17.39 / 0.348 22.41 / 0.213 15.06 / 0.317 19.69 / 0.323
NSFF 24.65 / 0.151 29.29 / 0.129 25.96 / 0.167 22.97 / 0.295 21.96 / 0.215 24.27 / 0.222 21.22 / 0.212 24.33 / 0.199
Ours 24.68 / 0.090 32.66 / 0.035 28.56 / 0.082 23.26 / 0.137 22.36 / 0.104 27.06 / 0.049 24.15 / 0.080 26.10 / 0.082

Please note:

  1. The numbers reported in the paper are calculated using TF code. The numbers here are calculated using this improved Pytorch version.
  2. In Yoon's results, the first frame and the last frame are missing. To compare with Yoon's results, we have to omit the first frame and the last frame. To do so, please uncomment line 72 and comment line 73 in evaluation.py.
  3. We obtain the results of NSFF and NR NeRF using the official implementation with default parameters.

Train a model on your sequence

  1. Set some paths
ROOT_PATH=/path/to/the/DynamicNeRF/folder
DATASET_NAME=name_of_the_video_without_extension
DATASET_PATH=$ROOT_PATH/data/$DATASET_NAME
  1. Prepare training images and background masks from a video.
cd $ROOT_PATH/utils
python generate_data.py --videopath /path/to/the/video
  1. Use COLMAP to obtain camera poses.
colmap feature_extractor \
--database_path $DATASET_PATH/database.db \
--image_path $DATASET_PATH/images_colmap \
--ImageReader.mask_path $DATASET_PATH/background_mask \
--ImageReader.single_camera 1

colmap exhaustive_matcher \
--database_path $DATASET_PATH/database.db

mkdir $DATASET_PATH/sparse
colmap mapper \
    --database_path $DATASET_PATH/database.db \
    --image_path $DATASET_PATH/images_colmap \
    --output_path $DATASET_PATH/sparse \
    --Mapper.num_threads 16 \
    --Mapper.init_min_tri_angle 4 \
    --Mapper.multiple_models 0 \
    --Mapper.extract_colors 0
  1. Save camera poses into the format that NeRF reads.
cd $ROOT_PATH/utils
python generate_pose.py --dataset_path $DATASET_PATH
  1. Estimate monocular depth.
cd $ROOT_PATH/utils
python generate_depth.py --dataset_path $DATASET_PATH --model $ROOT_PATH/weights/midas_v21-f6b98070.pt
  1. Predict optical flows.
cd $ROOT_PATH/utils
python generate_flow.py --dataset_path $DATASET_PATH --model $ROOT_PATH/weights/raft-things.pth
  1. Obtain motion mask (code adapted from NSFF).
cd $ROOT_PATH/utils
python generate_motion_mask.py --dataset_path $DATASET_PATH
  1. Train a model. Please change expname and datadir in configs/config.txt.
cd $ROOT_PATH/
python run_nerf.py --config configs/config.txt

Explanation of each parameter:

  • expname: experiment name
  • basedir: where to store ckpts and logs
  • datadir: input data directory
  • factor: downsample factor for the input images
  • N_rand: number of random rays per gradient step
  • N_samples: number of samples per ray
  • netwidth: channels per layer
  • use_viewdirs: whether enable view-dependency for StaticNeRF
  • use_viewdirsDyn: whether enable view-dependency for DynamicNeRF
  • raw_noise_std: std dev of noise added to regularize sigma_a output
  • no_ndc: do not use normalized device coordinates
  • lindisp: sampling linearly in disparity rather than depth
  • i_video: frequency of novel view-time synthesis video saving
  • i_testset: frequency of testset video saving
  • N_iters: number of training iterations
  • i_img: frequency of tensorboard image logging
  • DyNeRF_blending: whether use DynamicNeRF to predict blending weight
  • pretrain: whether pre-train StaticNeRF

License

This work is licensed under MIT License. See LICENSE for details.

If you find this code useful for your research, please consider citing the following paper:

@inproceedings{Gao-ICCV-DynNeRF,
    author    = {Gao, Chen and Saraf, Ayush and Kopf, Johannes and Huang, Jia-Bin},
    title     = {Dynamic View Synthesis from Dynamic Monocular Video},
    booktitle = {Proceedings of the IEEE International Conference on Computer Vision},
    year      = {2021}
}

Acknowledgments

Our training code is build upon NeRF, NeRF-pytorch, and NSFF. Our flow prediction code is modified from RAFT. Our depth prediction code is modified from MiDaS.

Owner
Chen Gao
Ph.D. student at Virginia Tech Vision and Learning Lab (@vt-vl-lab). Former intern at Google and Facebook Research.
Chen Gao
Ultra-lightweight human body posture key point CNN model. ModelSize:2.3MB HUAWEI P40 NCNN benchmark: 6ms/img,

Ultralight-SimplePose Support NCNN mobile terminal deployment Based on MXNET(=1.5.1) GLUON(=0.7.0) framework Top-down strategy: The input image is t

223 Dec 27, 2022
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!

Robust Video Matting (RVM) English | 中文 Official repository for the paper Robust High-Resolution Video Matting with Temporal Guidance. RVM is specific

flow-dev 2 Aug 21, 2022
YOLOX + ROS(1, 2) object detection package

YOLOX + ROS(1, 2) object detection package

Ar-Ray 158 Dec 21, 2022
ThunderGBM: Fast GBDTs and Random Forests on GPUs

Documentations | Installation | Parameters | Python (scikit-learn) interface What's new? ThunderGBM won 2019 Best Paper Award from IEEE Transactions o

Xtra Computing Group 647 Jan 04, 2023
Evaluation Pipeline for our ECCV2020: Journey Towards Tiny Perceptual Super-Resolution.

Journey Towards Tiny Perceptual Super-Resolution Test code for our ECCV2020 paper: https://arxiv.org/abs/2007.04356 Our x4 upscaling pre-trained model

Royson 6 Mar 30, 2022
The ICS Chat System project for NYU Shanghai Fall 2021

ICS_Chat_System [Catenger] This is the ICS Chat System project for NYU Shanghai Fall 2021 Creators: Shavarsh Melikyan, Skyler Chen and Arghya Sarkar,

1 Dec 20, 2021
GUI for a Vocal Remover that uses Deep Neural Networks.

GUI for a Vocal Remover that uses Deep Neural Networks.

4.4k Jan 07, 2023
PyTorch reimplementation of minimal-hand (CVPR2020)

Minimal Hand Pytorch Unofficial PyTorch reimplementation of minimal-hand (CVPR2020). you can also find in youtube or bilibili bare hand youtube or bil

Hao Meng 228 Dec 29, 2022
Classification of ecg datas for disease detection

ecg_classification Classification of ecg datas for disease detection

Atacan ÖZKAN 5 Sep 09, 2022
Customizable RecSys Simulator for OpenAI Gym

gym-recsys: Customizable RecSys Simulator for OpenAI Gym Installation | How to use | Examples | Citation This package describes an OpenAI Gym interfac

Xingdong Zuo 14 Dec 08, 2022
Implementation of Deformable Attention in Pytorch from the paper "Vision Transformer with Deformable Attention"

Deformable Attention Implementation of Deformable Attention from this paper in Pytorch, which appears to be an improvement to what was proposed in DET

Phil Wang 128 Dec 24, 2022
(to be released) [NeurIPS'21] Transformers Generalize DeepSets and Can be Extended to Graphs and Hypergraphs

Higher-Order Transformers Kim J, Oh S, Hong S, Transformers Generalize DeepSets and Can be Extended to Graphs and Hypergraphs, NeurIPS 2021. [arxiv] W

Jinwoo Kim 44 Dec 28, 2022
Go from graph data to a secure and interactive visual graph app in 15 minutes. Batteries-included self-hosting of graph data apps with Streamlit, Graphistry, RAPIDS, and more!

✔️ Linux ✔️ OS X ❌ Windows (#39) Welcome to graph-app-kit Turn your graph data into a secure and interactive visual graph app in 15 minutes! Why This

Graphistry 107 Jan 02, 2023
The fastai book, published as Jupyter Notebooks

English / Spanish / Korean / Chinese / Bengali / Indonesian The fastai book These notebooks cover an introduction to deep learning, fastai, and PyTorc

fast.ai 17k Jan 07, 2023
Short and long time series classification using convolutional neural networks

time-series-classification Short and long time series classification via convolutional neural networks In this project, we present a novel framework f

35 Oct 22, 2022
PyTorch implementation of Off-policy Learning in Two-stage Recommender Systems

Off-Policy-2-Stage This repo provides a PyTorch implementation of the MovieLens experiments for the following paper: Off-policy Learning in Two-stage

Jiaqi Ma 25 Dec 12, 2022
Bytedance Inc. 2.5k Jan 06, 2023
ICS 4u HD project, start before-wards. A curtain shooting game using python.

Touhou-Star-Salvation HDCH ICS 4u HD project, start before-wards. A curtain shooting game using python and pygame. By Jason Li For arts and gameplay,

15 Dec 22, 2022
Implementation of the Paper: "Parameterized Hypercomplex Graph Neural Networks for Graph Classification" by Tuan Le, Marco Bertolini, Frank Noé and Djork-Arné Clevert

Parameterized Hypercomplex Graph Neural Networks (PHC-GNNs) PHC-GNNs (Le et al., 2021): https://arxiv.org/abs/2103.16584 PHM Linear Layer Illustration

Bayer AG 26 Aug 11, 2022
Colossal-AI: A Unified Deep Learning System for Large-Scale Parallel Training

ColossalAI An integrated large-scale model training system with efficient parallelization techniques. arXiv: Colossal-AI: A Unified Deep Learning Syst

HPC-AI Tech 7.9k Jan 08, 2023