[ ICCV 2021 Oral ] Our method can estimate camera poses and neural radiance fields jointly when the cameras are initialized at random poses in complex scenarios (outside-in scenes, even with less texture or intense noise )

Overview

GNeRF

This repository contains official code for the ICCV 2021 paper: GNeRF: GAN-based Neural Radiance Field without Posed Camera. This implementation is written in Pytorch.

architecture

Abstract

We introduce GNeRF, a framework to marry Generative Adversarial Networks (GAN) with Neural Radiance Field (NeRF) reconstruction for the complex scenarios with unknown and even randomly initialized camera poses. Recent NeRF-based advances have gained popularity for remarkable realistic novel view synthesis. However, most of them heavily rely on accurate camera poses estimation, while few recent methods can only optimize the unknown camera poses in roughly forward-facing scenes with relatively short camera trajectories and require rough camera poses initialization. Differently, our GNeRF only utilizes randomly initialized poses for complex outside-in scenarios. We propose a novel two-phases end-to-end framework. The first phase takes the use of GANs into the new realm for optimizing coarse camera poses and radiance fields jointly, while the second phase refines them with additional photometric loss. We overcome local minima using a hybrid and iterative optimization scheme. Extensive experiments on a variety of synthetic and natural scenes demonstrate the effectiveness of GNeRF. More impressively, our approach outperforms the baselines favorably in those scenes with repeated patterns or even low textures that are regarded as extremely challenging before.

Installation

We recommand using Anaconda to setup the environment. Run the following commands:

# Create a conda environment named 'gnerf'
conda create --name gnerf python=3.7
# Activate the environment
conda activate gnerf
# Install requirements
pip install -r requirements.txt

Data

Blender

Download from the NeRF official Google Drive . Please download and unzip nerf_synthetic.zip.

DTU

Download the preprocessed DTU training data from original MVSNet repo and unzip. We also provide a few DTU examples for fast testing.

Your own data

We share some advices on preparing your own dataset and setting related parameters:

  • Pose sampling space should be close to the data: Our method requires a reasonable prior pose distribution.
  • The training may fail to converge on symmetrical scenes: The inversion network can not map an image to different poses.

Running

python train.py ./config/CONFIG.yaml --data_dir PATH/TO/DATASET

where you replace CONFIG.yaml with your config file (blender.yaml for blender dataset and dtu.yaml for DTU dataset). You can optionally monitor on the training process using tensorboard by adding --open_tensorboard argument. The default setting takes around 13GB GPU memory. After 40k iterations, you should get a video like these:

Evaluation

python eval.py --ckpt PATH/TO/CKPT.pt --gt PATH/TO/GT.json 

where you replace PATH/TO/CKPT.pt with your trained model checkpoint, and PATH/TO/GT.json with the json file in NeRF-Synthetic dataset. Then, just run the ATE toolbox on the evaluation directory.

List of Possible Improvements

For future work, we recommend the following aspects to further improve the performance and stability:

  • Replace the single NeRF network with mip-NeRF network: The use of separate MLPs in the original NeRF paper is a key detail to represent thin objects in the scene, if you retrain the original NeRF with only one MLP you will find a decrease in performance. While in our work, a single MLP network is necessary to keep the coarse image and fine image aligned. The cone casting and IPE features of mip-NeRF allow it to explicitly encode scale into the input features and thereby enable an MLP to learn a multiscale representation of the scene.

  • Combine BARF to further overcome local minima: The BARF method shows that susceptibility to noise from positional encoding affects the basin of attraction for registration and present a coarse-to-fine registration strategy.

  • Combine NeRF++ to represent the background in real scenes with complex background.

Citation

If you find our code or paper useful, please consider citing

@InProceedings{meng2021gnerf,
    author = {Meng, Quan and Chen, Anpei and Luo, Haimin and Wu, Minye and Su, Hao and Xu, Lan and He, Xuming and Yu, Jingyi},
    title = {{G}{N}e{R}{F}: {G}{A}{N}-based {N}eural {R}adiance {F}ield without {P}osed {C}amera},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    year = {2021}
}

Some code snippets are borrowed from GRAF and nerf_pl. Thanks for these great projects.

Owner
Quan Meng
Quan Meng
A modification of Daniel Russell's notebook merged with Katherine Crowson's hq-skip-net changes

Edits made to this repo by Katherine Crowson I have added several features to this repository for use in creating higher quality generative art (featu

Paul Fishwick 10 May 07, 2022
PyTorch implementation of "Learn to Dance with AIST++: Music Conditioned 3D Dance Generation."

Learn to Dance with AIST++: Music Conditioned 3D Dance Generation. Installation pip install -r requirements.txt Prepare Dataset bash data/scripts/pre

Zj Li 8 Sep 07, 2021
Training neural models with structured signals.

Neural Structured Learning in TensorFlow Neural Structured Learning (NSL) is a new learning paradigm to train neural networks by leveraging structured

955 Jan 02, 2023
git《Investigating Loss Functions for Extreme Super-Resolution》(CVPR 2020) GitHub:

Investigating Loss Functions for Extreme Super-Resolution NTIRE 2020 Perceptual Extreme Super-Resolution Submission. Our method ranked first and secon

Sejong Yang 0 Oct 17, 2022
🏃‍♀️ A curated list about human motion capture, analysis and synthesis.

Awesome Human Motion 🏃‍♀️ A curated list about human motion capture, analysis and synthesis. Contents Introduction Human Models Datasets Data Process

Dennis Wittchen 274 Dec 14, 2022
Get started learning C# with C# notebooks powered by .NET Interactive and VS Code.

.NET Interactive Notebooks for C# Welcome to the home of .NET interactive notebooks for C#! How to Install Download the .NET Coding Pack for VS Code f

.NET Platform 425 Dec 25, 2022
Generalized Data Weighting via Class-level Gradient Manipulation

Generalized Data Weighting via Class-level Gradient Manipulation This repository is the official implementation of Generalized Data Weighting via Clas

18 Nov 12, 2022
TRIQ implementation

TRIQ Implementation TF-Keras implementation of TRIQ as described in Transformer for Image Quality Assessment. Installation Clone this repository. Inst

Junyong You 115 Dec 30, 2022
Flexible-Modal Face Anti-Spoofing: A Benchmark

Flexible-Modal FAS This is the official repository of "Flexible-Modal Face Anti-

Zitong Yu 22 Nov 10, 2022
PyTorch Lightning + Hydra. A feature-rich template for rapid, scalable and reproducible ML experimentation with best practices. ⚡🔥⚡

Lightning-Hydra-Template A clean and scalable template to kickstart your deep learning project 🚀 ⚡ 🔥 Click on Use this template to initialize new re

Łukasz Zalewski 2.1k Jan 09, 2023
Code of paper "CDFI: Compression-Driven Network Design for Frame Interpolation", CVPR 2021

CDFI (Compression-Driven-Frame-Interpolation) [Paper] (Coming soon...) | [arXiv] Tianyu Ding*, Luming Liang*, Zhihui Zhu, Ilya Zharkov IEEE Conference

Tianyu Ding 95 Dec 04, 2022
Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX.

ONNX Object Localization Network Python scripts performing class agnostic object localization using the Object Localization Network model in ONNX. Ori

Ibai Gorordo 15 Oct 14, 2022
Official Pytorch Implementation of: "Semantic Diversity Learning for Zero-Shot Multi-label Classification"(2021) paper

Semantic Diversity Learning for Zero-Shot Multi-label Classification Paper Official PyTorch Implementation Avi Ben-Cohen, Nadav Zamir, Emanuel Ben Bar

28 Aug 29, 2022
Face Transformer for Recognition

Face-Transformer This is the code of Face Transformer for Recognition (https://arxiv.org/abs/2103.14803v2). Recently there has been great interests of

Zhong Yaoyao 153 Nov 30, 2022
A TensorFlow implementation of SOFA, the Simulator for OFfline LeArning and evaluation.

SOFA This repository is the implementation of SOFA, the Simulator for OFfline leArning and evaluation. Keeping Dataset Biases out of the Simulation: A

22 Nov 23, 2022
ML models implementation practice

Let's implement various ML algorithms with numpy/tf Vanilla Neural Network https://towardsdatascience.com/lets-code-a-neural-network-in-plain-numpy-ae

Jinsoo Heo 4 Jul 04, 2021
Neural Magic Eye: Learning to See and Understand the Scene Behind an Autostereogram, arXiv:2012.15692.

Neural Magic Eye Preprint | Project Page | Colab Runtime Official PyTorch implementation of the preprint paper "NeuralMagicEye: Learning to See and Un

Zhengxia Zou 56 Jul 15, 2022
一个目标检测的通用框架(不需要cuda编译),支持Yolo全系列(v2~v5)、EfficientDet、RetinaNet、Cascade-RCNN等SOTA网络。

一个目标检测的通用框架(不需要cuda编译),支持Yolo全系列(v2~v5)、EfficientDet、RetinaNet、Cascade-RCNN等SOTA网络。

Haoyu Xu 203 Jan 03, 2023
Time-Optimal Planning for Quadrotor Waypoint Flight

Time-Optimal Planning for Quadrotor Waypoint Flight This is an example implementation of the paper "Time-Optimal Planning for Quadrotor Waypoint Fligh

Robotics and Perception Group 38 Dec 02, 2022
CoReNet is a technique for joint multi-object 3D reconstruction from a single RGB image.

CoReNet CoReNet is a technique for joint multi-object 3D reconstruction from a single RGB image. It produces coherent reconstructions, where all objec

Google Research 80 Dec 25, 2022