Open source repository for the code accompanying the paper 'Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video'.

Overview

Non-Rigid Neural Radiance Fields

This is the official repository for the project "Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video" (NR-NeRF). We extend NeRF, a state-of-the-art method for photorealistic appearance and geometry reconstruction of a static scene, to deforming/non-rigid scenes. For details, we refer to the preprint and the project page, which also includes supplemental videos.

Pipeline figure

Getting Started

Installation

  • Clone this repository.
  • Set up the conda environment nrnerf (or install the requirements using pip):
conda env create -f environment.yml
  • (Optional) For data loading and camera parameter estimation, we have included a dummy implementation that only works on the included example sequence. If you do not want to write your own implementation as specified at the end of this README, you can instead use the following programs and files:
    • Install COLMAP.
    • From nerf-pytorch, use load_llff.py to replace the example version included in this repo.
      • In load_llff_data(), replace sc = 1. if bd_factor is None else 1./(bds.min() * bd_factor) with sc = 1./(bds.max() - bds.min())
    • From LLFF, copy from llff/poses/ the three files colmap_read_model.py, colmap_wrapper.py, and pose_utils.py directly into ./llff_preprocessing (replacing existing files).
      • In pose_utils.py fix the imports by:
        • Commenting out import skimage.transform,
        • Replacing from llff.poses.colmap_wrapper import run_colmap with from .colmap_wrapper import run_colmap,
        • Replacing import llff.poses.colmap_read_model as read_model with from . import colmap_read_model as read_model.
  • (Optional) An installation of FFMPEG enables automatic video generation from images and frame extraction from video input.
conda install -c conda-forge ffmpeg
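
The replacement scale factor for load_llff_data() described above normalizes the scene by its depth range instead of by its nearest depth. The following is a minimal sketch of that effect, not the code from load_llff.py. It assumes, as in nerf-pytorch, that the factor multiplies both the pose translations and the bounds; the helper name rescale_scene and the toy data are hypothetical.

```python
import numpy as np

def rescale_scene(poses, bds):
    """Scale pose translations and depth bounds by the depth range (hypothetical helper)."""
    # Replacement factor from the installation notes: normalize by the depth range.
    sc = 1.0 / (bds.max() - bds.min())
    scaled_poses = poses.copy()
    # Assumption: translations are stored in poses[:, :3, 3], as in nerf-pytorch.
    scaled_poses[:, :3, 3] *= sc
    return scaled_poses, bds * sc, sc

# Toy data: two frames with depth bounds between 2 and 10 world units.
poses = np.zeros((2, 3, 5))
poses[:, :3, 3] = 8.0
bds = np.array([[2.0, 6.0], [3.0, 10.0]])

scaled_poses, scaled_bds, sc = rescale_scene(poses, bds)
print("scale factor:", sc)
print("scaled depth range:", scaled_bds.max() - scaled_bds.min())
```

After rescaling, the distance between the near and far planes is one world unit, which is in line with the requirement that the scene roughly lies in the unit cube.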

Walkthrough With an Example Sequence

Having set up the environment, we now show an example that starts with a folder of just images and ends up with a fixed viewpoint re-rendering of the sequence. Please read the sections after this one for details on each step and how to adapt the pipeline to other sequences.

We first navigate into the parent folder (where train.py etc. lie) and activate the conda environment:

conda activate nrnerf

(Preprocess) We then determine the camera parameters:

python preprocess.py --input data/example_sequence/

(Training) Next, we train the model with the scene-specific config:

python train.py --config configs/example_sequence.txt

(Free Viewpoint Rendering) Finally, we synthesize a novel camera path:

python free_viewpoint_rendering.py --input experiments/experiment_1/ --deformations train --camera_path fixed --fixed_view 10

All results will be in the same folder, experiments/experiment_1/output/train_fixed_10/.

Overall, the input video (left) is re-rendered into a fixed novel view (right):

Novel view synthesis result on example sequence

Convenience Features

  • Works with video file input,
  • Script for lens distortion estimation and undistortion of input files,
  • Automatic multi-GPU support (torch.nn.DataParallel),
  • Automatically continues training if previous training detected,
  • Some modifications to lessen GPU memory requirements and to speed up loading at the start of training.

Practical Tips for Recording Scenes

As this is a research project, it is not sufficiently robust to work on arbitrary scenes. Here are some tips to consider when recording new scenes:

  • Sequences should have lengths of about 100-300 frames. More frames require longer training.
  • Avoid blur (e.g., motion blur or out-of-focus blur).
  • Keep camera settings like color temperature and focal length fixed.
  • Avoid lens distortions or estimate distortion parameters for undistortion.
  • Stick to front-facing camera paths that capture most of the scene in all images.
  • Use sufficient lighting and avoid changing it while recording.
  • Avoid self-shadowing.
  • Record only Lambertian surfaces and avoid view-dependent effects like specularities (modeling of view-dependent effects can be activated by setting use_viewdirs=True).
  • The background needs to be static and dominant enough for SfM to estimate extrinsics.
  • Limited scene size: Ensure that the background is not more than an order of magnitude farther from the camera than the non-rigid foreground.
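
The scene-size tip can be turned into a quick sanity check once rough distances to the foreground and background are known, for example from a tape measure or from sparse SfM depths. This is a minimal sketch; the function names and the threshold of 10 (one order of magnitude) are illustrative assumptions and not part of this repository.

```python
def scene_depth_ratio(foreground_depth, background_depth):
    """Return how many times farther the background is than the foreground."""
    if foreground_depth <= 0 or background_depth <= 0:
        raise ValueError("depths must be positive")
    return background_depth / foreground_depth

def scene_size_ok(foreground_depth, background_depth, max_ratio=10.0):
    """True if the background is at most max_ratio times farther than the foreground."""
    return scene_depth_ratio(foreground_depth, background_depth) <= max_ratio

# Foreground at 1 m and background wall at 8 m: within one order of magnitude.
print(scene_size_ok(1.0, 8.0))
# Foreground at 0.5 m and background at 20 m: too deep a scene.
print(scene_size_ok(0.5, 20.0))
```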

Using the Code

Preprocess

Determining Camera Parameters

Before we can train a network on a newly recorded sequence, we need to estimate its camera parameters (extrinsics and intrinsics).

The preprocessing code assumes the folder structure PARENT_FOLDER/images/IMAGE_NAME1.png. To determine the camera parameters for such a sequence, please run

python preprocess.py --input PARENT_FOLDER

The --output OUTPUT_FOLDER option sets a custom output folder; otherwise, PARENT_FOLDER is used by default.
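
Before running the preprocessing, it can help to verify that a new sequence follows the expected PARENT_FOLDER/images/ layout. The sketch below is a standalone check, not part of preprocess.py. The function name find_input_images and the set of accepted file extensions are assumptions.

```python
from pathlib import Path

# Assumption: which file extensions count as input images.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}

def find_input_images(parent_folder):
    """Return the sorted image files in PARENT_FOLDER/images or raise an error."""
    images_dir = Path(parent_folder) / "images"
    if not images_dir.is_dir():
        raise FileNotFoundError(f"expected an 'images' subfolder in {parent_folder}")
    images = sorted(
        p for p in images_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    if not images:
        raise FileNotFoundError(f"no image files found in {images_dir}")
    return images
```

For example, find_input_images("data/example_sequence/") should list the frames of the included example sequence if it is laid out as described above.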

(Optional) Lens Distortion Estimation and Image Undistortion

While not necessary for decent results with most camera lenses, the preprocessing code can estimate lens distortions from a checkerboard/chessboard sequence and then use the estimated distortion parameters to undistort input sequences recorded with the same camera.

First, record a checkerboard sequence and run the following command to estimate lens distortion parameters from it:

python preprocess.py --calibrate_lens_distortion --input PARENT_FOLDER --checkerboard_width WIDTH --checkerboard_height HEIGHT

The calibration code uses OpenCV. HEIGHT and WIDTH refer to the number of squares, not to lengths. The optional flags --visualize_detections and --undistort_calibration_images can help diagnose issues with the calibration process; see preprocess.py for details.

Then, in order to undistort an input sequence using the computed parameters, simply add --undistort_with_calibration_file PATH_TO_LENS_DISTORTION_JSON when preprocessing the sequence using preprocess.py as described under Determining Camera Parameters.

(Optional) Video Input

In addition to image files, the preprocessing code in preprocess.py also supports video input. Simply set --input to the video file.

This requires an installation of ffmpeg. The --ffmpeg_path PATH_TO_EXECUTABLE option sets a custom path to an ffmpeg executable.

The --fps 10 option can be used to modify the framerate at which images are extracted from the video. The default is 5.
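
For readers who want to extract frames manually, the sketch below builds an ffmpeg command that samples a video at a given framerate using ffmpeg's fps filter. It is not necessarily how preprocess.py invokes ffmpeg. The function name, the output file pattern, and the example paths are assumptions.

```python
from pathlib import Path

def build_frame_extraction_command(video_path, images_dir, fps=5, ffmpeg_path="ffmpeg"):
    """Build an ffmpeg command that writes frames sampled at `fps` frames per second."""
    # Assumption: frames are written as zero-padded PNG files (00001.png, 00002.png, ...).
    output_pattern = str(Path(images_dir) / "%05d.png")
    return [
        ffmpeg_path,
        "-i", str(video_path),   # input video
        "-vf", f"fps={fps}",     # resample to the requested framerate
        output_pattern,
    ]

cmd = build_frame_extraction_command("my_video.mp4", "my_sequence/images", fps=10)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the output folder exists.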

Training

The config file default.txt needs to be modified as follows:

  • rootdir: An output folder that collects all experiments (i.e. multiple trainings)
  • datadir: Recorded input sequence. Set to PARENT_FOLDER from the Preprocess section above
  • expname: Name of this experiment. Output will be written to rootdir/expname/

Other relevant parameters are:

  • offsets_loss_weight, divergence_loss_weight, rigidity_loss_weight: Weights for loss terms. Need to be tuned for each scene, see the preprint for details.
  • factor: Downsamples the input sequence by factor before training on it.
  • use_viewdirs: Set to True to activate view-dependent effects. Note that this slows down training by about 20% (approximate) or 35% (exact) on a V100 GPU.
  • approx_nonrigid_viewdirs: True uses a fast finite difference approximation of the view direction, False computes the exact direction.

Finally, start the training by running:

python train.py

A custom config file can optionally be passed via --config CONFIG_FILE.

The train_block_size and test_block_size options split the images into alternating training and test blocks. The scheme is AAAAABBAAAAABBAAA for train_block_size=5 and test_block_size=2, where A denotes a training image and B a test image. Note that optimizing for the latent codes of test images slows down training by about 30% (relative to only using training images) due to an additional backward pass.
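
The block scheme can be sketched in a few lines. This is an illustrative reimplementation of the pattern described above, not the code used in train.py, and its handling of edge cases (such as test_block_size=0) is an assumption. The function name split_train_test is hypothetical.

```python
def split_train_test(num_frames, train_block_size, test_block_size):
    """Assign frame indices to alternating training and test blocks."""
    if train_block_size < 1 or test_block_size < 0:
        raise ValueError("need train_block_size >= 1 and test_block_size >= 0")
    period = train_block_size + test_block_size
    train, test = [], []
    for i in range(num_frames):
        # The first train_block_size frames of every period are training frames.
        if i % period < train_block_size:
            train.append(i)
        else:
            test.append(i)
    return train, test

train, test = split_train_test(17, train_block_size=5, test_block_size=2)
scheme = "".join("A" if i in train else "B" for i in range(17))
print(scheme)  # AAAAABBAAAAABBAAA
print(test)    # [5, 6, 12, 13]
```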

If a previous version of the experiment exists, train.py will automatically continue training from it. To prevent that, pass the --no_reload flag.

Free Viewpoint Rendering

Once we've trained a network, we can render it into novel views.

The following arguments are mandatory:

  • input: Set to the folder of the trained network, i.e. rootdir/expname/
  • deformations: Set to the subset of the deformations/images that are to be used. Can be train, test, or all
  • camera_path: Possible camera paths are: input_recontruction, fixed, and spiral.

Then, we can synthesize novel views by running:

python free_viewpoint_rendering.py --input INPUT --deformations train --camera_path fixed

The fixed camera view uses the first input view by default. This can be set to another index (e.g. 5) with --fixed_view 5.

Furthermore, the forced background stabilization described in the preprint can be used by passing a threshold via the --forced_background_stabilization 0.01 option. The canonical model (without any ray bending applied) can be rendered by setting the --render_canonical flag. Finally, the framerate of the generated output videos can be set with --output_video_fps 5.

For automatic video generation, please install ffmpeg.

(Optional) Adaptive Spiral Camera Path

It is also possible to use a spiral camera path that adapts to the length of the video. If you do not want to implement such a path yourself, you can copy and modify the else branch in load_llff_data of load_llff.py. You can find a recommended wrapper in free_viewpoint_rendering: _spiral_poses. Set N_views to num_poses. We recommend multiplying rads in render_path_spiral right before the for loop by 0.5.
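
To make the idea concrete, here is a simplified spiral path generator in the spirit of render_path_spiral. It is a sketch under stated assumptions and not the LLFF or nerf-pytorch code: the function names, the rad_scale parameter (the recommended factor of 0.5), and the depth oscillation rate are all hypothetical choices. Poses use the convention from the specification at the end of this README (camera-to-world, x right, y up, z back).

```python
import numpy as np

def look_at(z_back, up, position):
    """Build a 3x4 camera-to-world pose with columns x (right), y (up), z (back), position."""
    z = z_back / np.linalg.norm(z_back)
    x = np.cross(up, z)
    x = x / np.linalg.norm(x)
    y = np.cross(z, x)
    return np.stack([x, y, z, position], axis=1)

def spiral_poses(center_pose, rads, focal, num_poses, rots=2, rad_scale=0.5):
    """Generate num_poses poses on a spiral around the 3x4 center_pose."""
    # Shrink the spiral radii, then append 1 for the homogeneous coordinate.
    rads = np.append(np.asarray(rads, dtype=float) * rad_scale, 1.0)
    up = center_pose[:, 1]
    # All cameras look at a point `focal` units in front of the center camera.
    target = center_pose @ np.array([0.0, 0.0, -focal, 1.0])
    poses = []
    for theta in np.linspace(0.0, 2.0 * np.pi * rots, num_poses, endpoint=False):
        offset = np.array([np.cos(theta), -np.sin(theta), -np.sin(0.5 * theta), 1.0]) * rads
        position = center_pose @ offset
        poses.append(look_at(position - target, up, position))
    return np.stack(poses)

# One pose per video frame, so the path adapts to the length of the sequence.
center = np.concatenate([np.eye(3), np.zeros((3, 1))], axis=1)
path = spiral_poses(center, rads=[0.1, 0.1, 0.1], focal=1.0, num_poses=8)
print(path.shape)  # (8, 3, 4)
```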

Cite

When using this code, please cite our preprint Tretschk et al.: Non-Rigid Neural Radiance Fields as well as the following works on which it builds:

@misc{tretschk2020nonrigid,
      title={Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video},
      author={Edgar Tretschk and Ayush Tewari and Vladislav Golyanik and Michael Zollhöfer and Christoph Lassner and Christian Theobalt},
      year={2020},
      eprint={2012.12247},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
@misc{lin2020nerfpytorch,
  title={NeRF-pytorch},
  author={Yen-Chen, Lin},
  howpublished={\url{https://github.com/yenchenlin/nerf-pytorch/}},
  year={2020}
}
@inproceedings{mildenhall2020nerf,
 title={NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis},
 author={Ben Mildenhall and Pratul P. Srinivasan and Matthew Tancik and Jonathan T. Barron and Ravi Ramamoorthi and Ren Ng},
 year={2020},
 booktitle={ECCV},
}

Specification of Missing Functions

load_llff_data from load_llff.py needs to return the following values, in this order:

  • images: a numpy array of shape N x H x W x 3 with RGB values scaled to lie between 0 and 1,
  • poses: a numpy array of shape N x 3 x 5, where poses[:,:,:3] are the camera extrinsic rotations, poses[:,:,3] are the camera extrinsic translations in world units, and poses[:,:,4] are height, width, and focal length in pixels at every frame (the same at all N frames),
  • bds: a numpy array containing the depth values of the near and far planes in world units (only the maximum and minimum entries of bds matter),
  • render_poses: a numpy array of shape N x 3 x 4 with rotation and translation encoded as for poses,
  • i_test: an image index.

The first argument specifies the directory from which the images should be loaded, and the second argument specifies a downsampling factor that should be applied to the images. The remaining arguments can be ignored.
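
When writing a custom load_llff_data, a small validation function can catch format mistakes early. The sketch below checks the return values against the specification above. The function name check_llff_outputs is hypothetical, and the check that the stored height and width match the image resolution is an assumption about the intended meaning of poses[:,:,4].

```python
import numpy as np

def _require(condition, message):
    if not condition:
        raise ValueError(message)

def check_llff_outputs(images, poses, bds, render_poses, i_test):
    """Raise ValueError if the outputs of load_llff_data violate the specification."""
    _require(images.ndim == 4 and images.shape[-1] == 3, "images must be N x H x W x 3")
    n, h, w, _ = images.shape
    _require(images.min() >= 0.0 and images.max() <= 1.0, "RGB values must lie in [0, 1]")
    _require(poses.shape == (n, 3, 5), "poses must be N x 3 x 5")
    _require(np.allclose(poses[:, :, 4], poses[0, :, 4]), "height/width/focal must match across frames")
    # Assumption: the stored height and width equal the resolution of the returned images.
    _require(poses[0, 0, 4] == h and poses[0, 1, 4] == w, "poses[:,:,4] must start with H, W")
    _require(bds.min() < bds.max(), "near plane must be closer than far plane")
    _require(render_poses.ndim == 3 and render_poses.shape[1:] == (3, 4), "render_poses must be ... x 3 x 4")
    _require(0 <= int(i_test) < n, "i_test must index an image")

# Toy example with 4 frames of resolution 6 x 8.
images = np.full((4, 6, 8, 3), 0.5)
poses = np.zeros((4, 3, 5))
poses[:, :, :3] = np.eye(3)
poses[:, :, 4] = [6, 8, 10.0]  # height, width, focal length
bds = np.array([[0.2, 1.0]] * 4)
render_poses = poses[:, :, :4].copy()
check_llff_outputs(images, poses, bds, render_poses, i_test=0)
print("outputs match the specification")
```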

gen_poses from llff_preprocessing/pose_utils.py should compute and store camera parameters of the images given by the first argument such that the format is compatible with load_llff_data. The second argument can be ignored.

The camera extrinsic translation is in world space. The translations should be scaled such that the overall scene roughly lies in the unit cube. The camera extrinsic rotation is camera-to-world, R * c = w. The camera coordinate system has the x-axis pointing to the right, y up, and z back.
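
The coordinate convention can be illustrated with a short sketch. It assumes that the translation is the camera position in world space, so that a point c in camera coordinates maps to w = R * c + t, and that the camera looks along its negative z-axis (since z points back). The function names are hypothetical.

```python
import numpy as np

def camera_to_world(rotation, translation, point_cam):
    """Map a point from camera coordinates to world coordinates: w = R c + t."""
    return rotation @ point_cam + translation

def viewing_direction(rotation):
    """World-space viewing direction: the camera looks along -z (z points back)."""
    return rotation @ np.array([0.0, 0.0, -1.0])

# Camera at (1, 2, 3) with axes aligned to the world axes.
R = np.eye(3)
t = np.array([1.0, 2.0, 3.0])
print(camera_to_world(R, t, np.zeros(3)))  # camera center in world space
print(viewing_direction(R))                # looks along the world's -z axis

# Camera rotated by 90 degrees about the y-axis: it now looks along the world's -x axis.
R_y90 = np.array([[0.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0],
                  [-1.0, 0.0, 0.0]])
print(viewing_direction(R_y90))
```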

License

This code builds on the PyTorch port by Yen-Chen Lin of the original NeRF code. Both are released under an MIT license. Several functions in run_nerf_helpers.py are modified versions from the FFJORD code, which is released under an MIT license. We thank all of them for releasing their code.

We release this code under an MIT license as well. You can find all licenses in the file LICENSE.

Owner
Facebook Research