Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic Scenes", ICCV 2021.

Overview

Deep 3D Mask Volume for View Synthesis of Dynamic Scenes

Official PyTorch Implementation of paper "Deep 3D Mask Volume for View Synthesis of Dynamic Scenes", ICCV 2021.

Kai-En Lin1, Lei Xiao2, Feng Liu2, Guowei Yang1, Ravi Ramamoorthi1

1University of California, San Diego, 2Facebook Reality Labs

Project Page | Paper | Supplementary Materials | Pretrained models | Dataset | Preprocessing script

Requirements

Install required packages

Make sure you have up-to-date NVIDIA drivers supporting CUDA 11.1 (10.2 could work but need to change cudatoolkit package accordingly)

Run

conda env create -f environment.yml
conda activate video_viewsynth

Usage

Rendering

  1. Download our pretrained checkpoint and testing data. Extract the content to [path_to_data_directory]. It contains frames and background folders, as well as poses_bounds.npy.

  2. In configs, setup data path by changing render_video.txt

    root_dir should point to the frames folder mentioned in 1. and bg_dir should point to background folder.

    out_dir can be your desired output folder.

    ckpt_path should be the pretrained checkpoint path.

  3. Run python render_llff_video.py --config [config_file_path]

    e.g. python render_llff_video.py --config ../configs/render_video.txt

  • (Optional) For your own data, please run prepare_data.sh

    sh render.sh [frame_folder] [starting_frame] [ending_frame] [output_folder_name]

    Make sure your data is in this structure before running

    [frame_folder] --- cam00 --- 00000.jpg
                    |         |- 00001.jpg
                    |         ...
                    |- cam01
                    |- cam02
                    ...
                    |- poses_bounds.npy
    

    e.g. sh render.sh ~/deep_3d_data/frames 0 20 qual

Training

Train MPI

  1. Download RealEstate10K dataset and extract the frames. There are scripts in preprocessing folder which can be used to generate the data.

    The order should be download_data.py -> extract_frames.py -> compress_data.py.

    Remember to change the path in compress_data.py.

  2. Change the paths in config file train_realestate10k.txt

  3. Run

    cd train_mpi
    python train.py --config ../configs/train_realestate10k.txt
    

Train Mask

Once MPI is trained, we can use the checkpoint to train 3D mask network.

  1. Download dataset

  2. Change the paths in config file train_mask.txt

  3. Run

    cd train_mask
    python train.py --config ../configs/train_mask.txt
    

Citation

@inproceedings {lin2021deep,
    title = {Deep 3D Mask Volume for View Synthesis of Dynamic Scenes},
    author = {Kai-En Lin and Lei Xiao and Feng Liu and Guowei Yang and Ravi Ramamoorthi},
    booktitle = {ICCV},
    year = {2021},
}
Owner
Ken Lin
Ken Lin
Dealing With Misspecification In Fixed-Confidence Linear Top-m Identification

Dealing With Misspecification In Fixed-Confidence Linear Top-m Identification This repository is the official implementation of [Dealing With Misspeci

0 Oct 25, 2021
My personal code and solution to the Synacor Challenge from 2012 OSCON.

Synacor OSCON Challenge Solution (2012) This repository contains my code and solution to solve the Synacor OSCON 2012 Challenge. If you are interested

2 Mar 20, 2022
A unified 3D Transformer Pipeline for visual synthesis

Overview This is the official repo for the paper: "NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion". NÜWA is a unified multimodal

Microsoft 2.6k Jan 03, 2023
PyTorch Implementation of Realtime Multi-Person Pose Estimation project.

PyTorch Realtime Multi-Person Pose Estimation This is a pytorch version of Realtime_Multi-Person_Pose_Estimation, origin code is here Realtime_Multi-P

Dave Fang 157 Nov 12, 2022
R interface to fast.ai

R interface to fastai The fastai package provides R wrappers to fastai. The fastai library simplifies training fast and accurate neural nets using mod

113 Dec 20, 2022
Set of models for classifcation of 3D volumes

Classification models 3D Zoo - Keras and TF.Keras This repository contains 3D variants of popular CNN models for classification like ResNets, DenseNet

69 Dec 28, 2022
Source code for the paper: Variance-Aware Machine Translation Test Sets (NeurIPS 2021 Datasets and Benchmarks Track)

Variance-Aware-MT-Test-Sets Variance-Aware Machine Translation Test Sets License See LICENSE. We follow the data licensing plan as the same as the WMT

NLP2CT Lab, University of Macau 5 Dec 21, 2021
A python implementation of Physics-informed Spline Learning for nonlinear dynamics discovery

PiSL A python implementation of Physics-informed Spline Learning for nonlinear dynamics discovery. Sun, F., Liu, Y. and Sun, H., 2021. Physics-informe

Fangzheng (Andy) Sun 8 Jul 13, 2022
Code of Adverse Weather Image Translation with Asymmetric and Uncertainty aware GAN

Adverse Weather Image Translation with Asymmetric and Uncertainty-aware GAN (AU-GAN) Official Tensorflow implementation of Adverse Weather Image Trans

Jeong-gi Kwak 36 Dec 26, 2022
OHLC Average Prediction of Apple Inc. Using LSTM Recurrent Neural Network

Stock Price Prediction of Apple Inc. Using Recurrent Neural Network OHLC Average Prediction of Apple Inc. Using LSTM Recurrent Neural Network Dataset:

Nouroz Rahman 410 Jan 05, 2023
Semi-Supervised 3D Hand-Object Poses Estimation with Interactions in Time

Semi Hand-Object Semi-Supervised 3D Hand-Object Poses Estimation with Interactions in Time (CVPR 2021).

96 Dec 27, 2022
The world's largest toxicity dataset.

The Toxicity Dataset by Surge AI Saving the internet is fun. Combing through thousands of online comments to build a toxicity dataset isn't. That's wh

Surge AI 134 Dec 19, 2022
Code for the AAAI 2022 paper "Zero-Shot Cross-Lingual Machine Reading Comprehension via Inter-Sentence Dependency Graph".

multilingual-mrc-isdg Code for the AAAI 2022 paper "Zero-Shot Cross-Lingual Machine Reading Comprehension via Inter-Sentence Dependency Graph". This r

Liyan 5 Dec 07, 2022
SwinTrack: A Simple and Strong Baseline for Transformer Tracking

SwinTrack This is the official repo for SwinTrack. A Simple and Strong Baseline Prerequisites Environment conda (recommended) conda create -y -n SwinT

LitingLin 196 Jan 04, 2023
Revisiting Weakly Supervised Pre-Training of Visual Perception Models

SWAG: Supervised Weakly from hashtAGs This repository contains SWAG models from the paper Revisiting Weakly Supervised Pre-Training of Visual Percepti

Meta Research 134 Jan 05, 2023
CSKG is a commonsense knowledge graph that combines seven popular sources into a consolidated representation

CSKG: The CommonSense Knowledge Graph CSKG is a commonsense knowledge graph that combines seven popular sources into a consolidated representation: AT

USC ISI I2 85 Dec 12, 2022
Code of PVTv2 is released! PVTv2 largely improves PVTv1 and works better than Swin Transformer with ImageNet-1K pre-training.

Updates (2020/06/21) Code of PVTv2 is released! PVTv2 largely improves PVTv1 and works better than Swin Transformer with ImageNet-1K pre-training. Pyr

1.3k Jan 04, 2023
Object detection (YOLO) with pytorch, OpenCV and python

Real Time Object/Face Detection Using YOLO-v3 This project implements a real time object and face detection using YOLO algorithm. You only look once,

1 Aug 04, 2022
RobustART: Benchmarking Robustness on Architecture Design and Training Techniques

The first comprehensive Robustness investigation benchmark on large-scale dataset ImageNet regarding ARchitecture design and Training techniques towards diverse noises.

132 Dec 23, 2022
Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth [Paper]

Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth [Paper] Downloads [Downloads] Trained ckpt files for NYU Depth V2 and

98 Jan 01, 2023