Semantic-NeRF: Semantic Neural Radiance Fields

Project Page | Video | Paper | Data

In-Place Scene Labelling and Understanding with Implicit Scene Representation
Shuaifeng Zhi, Tristan Laidlow, Stefan Leutenegger, Andrew J. Davison,
Dyson Robotics Laboratory at Imperial College
Published in ICCV 2021 (Oral Presentation)

We build upon neural radiance fields to create a scene-specific implicit 3D semantic representation, Semantic-NeRF.

Getting Started

To reproduce our results, Ubuntu 20.04 is recommended. The models have been tested with Python 3.7, PyTorch 1.6.0 and CUDA 10.1. Higher versions should also perform similarly.

Dependencies

The main Python dependencies are listed below:

  • Python >=3.7
  • torch>=1.6.0 (includes the built-in searchsorted API; older versions need the third-party SearchSorted implementation)
  • cudatoolkit>=10.1
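The minimum versions above can be checked against an installed version string before running anything heavier. The following is a minimal sketch using only the standard library; the helper name `version_at_least` is hypothetical (not part of the repository), and it assumes version strings of the form `1.6.0` or `1.6.0+cu101`.

```python
import re


def version_at_least(found, required, width=3):
    """Return True if version string `found` is at least `required`.

    Hypothetical helper, not part of Semantic-NeRF. Local build suffixes
    such as "+cu101" are ignored and only the first `width` numeric
    components are compared.
    """
    def parse(version):
        core = version.split("+", 1)[0]
        parts = [int(p) for p in re.findall(r"\d+", core)][:width]
        return tuple(parts + [0] * (width - len(parts)))

    return parse(found) >= parse(required)


if __name__ == "__main__":
    try:
        import torch  # only needed for the live check
    except ImportError:
        print("torch is not installed")
    else:
        print("torch >= 1.6.0:", version_at_least(torch.__version__, "1.6.0"))
        # The README states that searchsorted is integrated from torch 1.6.0.
        print("built-in searchsorted:", hasattr(torch, "searchsorted"))
```

If the second check prints `False`, the third-party SearchSorted implementation mentioned above would be needed instead.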

The following packages are used for 3D mesh reconstruction:

  • trimesh==3.9.9
  • open3d==0.12.0

With Anaconda, you can create a virtual environment and install the dependencies as follows:

  • conda create -n semantic_nerf python=3.7
  • conda activate semantic_nerf
  • pip install -r requirements.txt

Datasets

We mainly use the Replica and ScanNet datasets for experiments, training a new Semantic-NeRF model on each 3D scene. Other similar indoor datasets with colour images, semantic labels and poses can also be used.

We also provide pre-rendered Replica data that can be directly used by Semantic-NeRF.

Running code

After cloning the code, run Semantic-NeRF from the root directory of the repository.

Semantic-NeRF training

For standard Semantic-NeRF training with full dense semantic supervision, run the following command with a config file that specifies the data directory and hyper-parameters.

python3 train_SSR_main.py --config_file SSR/configs/SSR_room0_config.yaml

Different working modes and set-ups can be selected via command-line flags:

Semantic View Synthesis with Sparse Labels:

python3 train_SSR_main.py --sparse_views --sparse_ratio 0.6

Here, the sparse ratio is the proportion of frames dropped from the training sequence.
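To make the ratio concrete, the sketch below shows how a sparse ratio of 0.6 translates into a set of kept frames. The function name `kept_frame_indices` is hypothetical, and spreading the kept frames evenly over the sequence is an assumption made for illustration; the repository may select frames differently.

```python
def kept_frame_indices(num_frames, sparse_ratio):
    """Return the indices of frames that remain after dropping a fraction.

    Hypothetical helper. Assumption: the kept frames are spread evenly
    over the sequence, always including the first and last frame.
    """
    if not 0.0 <= sparse_ratio <= 1.0:
        raise ValueError("sparse_ratio must lie in [0, 1]")
    num_dropped = round(num_frames * sparse_ratio)
    num_kept = num_frames - num_dropped
    if num_kept <= 0:
        return []
    if num_kept == 1:
        return [0]
    # Integer arithmetic keeps the indices distinct and in range.
    return [i * (num_frames - 1) // (num_kept - 1) for i in range(num_kept)]


print(kept_frame_indices(10, 0.6))  # [0, 3, 6, 9]
```

With 10 training frames and a sparse ratio of 0.6, six frames are dropped and four are kept.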

Pixel-wise Denoising Task:

python3 train_SSR_main.py --pixel_denoising --pixel_noise_ratio 0.5

A sparse set of frames can also be combined with the denoising task:

python3 train_SSR_main.py --pixel_denoising --pixel_noise_ratio 0.5 --sparse_views --sparse_ratio 0.6
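As an illustration of what a pixel noise ratio of 0.5 means, the sketch below corrupts half of the labels in a toy label list. The helper `add_pixel_noise` is hypothetical, and replacing each selected label with a uniformly random different class is an assumed noise model; the corruption scheme used by the repository may differ.

```python
import random


def add_pixel_noise(labels, noise_ratio, num_classes, seed=0):
    """Return a copy of `labels` with a fraction of entries reassigned.

    Hypothetical helper: `labels` is a flat list of class indices for one
    image, and each selected pixel receives a random *different* class.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes to flip a label")
    rng = random.Random(seed)  # fixed seed keeps the noise reproducible
    noisy = list(labels)
    num_noisy = round(len(labels) * noise_ratio)
    for idx in rng.sample(range(len(labels)), num_noisy):
        alternatives = [c for c in range(num_classes) if c != labels[idx]]
        noisy[idx] = rng.choice(alternatives)
    return noisy


clean = [0, 1, 2, 3] * 25  # 100 pixels, 4 classes
noisy = add_pixel_noise(clean, noise_ratio=0.5, num_classes=4)
changed = sum(a != b for a, b in zip(clean, noisy))
print(changed, "of", len(clean), "labels were changed")
```

Because every selected pixel is forced to a different class, exactly half of the labels differ from the clean input in this toy example.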

Region-wise Denoising Task (for Replica Room2):

python3 train_SSR_main.py --region_denoising --region_noise_ratio 0.3

The argument uniform_flip selects between the two modes ("Even" and "Sort") of the region-wise denoising task.

Super-Resolution Task:

For super-resolution with dense labels, please run

python3 train_SSR_main.py --super_resolution --sr_factor 8 --dense_sr

For super-resolution with sparse labels, please run

python3 train_SSR_main.py --super_resolution --sr_factor 8
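One plausible reading of super-resolution with sparse labels is that only one pixel in each sr_factor × sr_factor block keeps its label, so that with a factor of 8 only 1 in 64 pixels is supervised. The sketch below illustrates that reading on a toy label map. The helper name, the `VOID` value and the choice of the top-left pixel of each block are all assumptions made for illustration, not the repository's implementation.

```python
VOID = -1  # hypothetical "ignore" label for unlabelled pixels


def sparse_sr_labels(label_map, sr_factor, void=VOID):
    """Keep one label per sr_factor x sr_factor block and void the rest.

    Hypothetical helper. Assumption: the kept pixel is the top-left one
    of each block; `label_map` is a list of rows of class indices.
    """
    return [
        [
            label if (r % sr_factor == 0 and c % sr_factor == 0) else void
            for c, label in enumerate(row)
        ]
        for r, row in enumerate(label_map)
    ]


toy = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
for row in sparse_sr_labels(toy, sr_factor=2):
    print(row)
```

In this 4×4 example with a factor of 2, four of the sixteen pixels keep their labels.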

Label Propagation Task:

For the label propagation task with single-click seed regions, please run

python3 train_SSR_main.py --label_propagation --partial_perc 0

To improve the reproducibility of the denoising and label-propagation tasks, --visualise_save and --load_saved can also be added to save and load the randomly generated labels.
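When several of these runs are scripted, the commands can be assembled programmatically. The sketch below builds argument lists from the flags shown in this README only; the helper `build_train_command` is hypothetical, and pairing a first run using --visualise_save with a later run using --load_saved is an assumed workflow based on the description above.

```python
import shlex


def build_train_command(*flags, config_file=None):
    """Assemble an argument list for train_SSR_main.py.

    Hypothetical helper: it only concatenates the given flags and does
    not validate them against the script's argument parser.
    """
    cmd = ["python3", "train_SSR_main.py"]
    if config_file is not None:
        cmd += ["--config_file", config_file]
    cmd += [str(flag) for flag in flags]
    return cmd


# First run: generate noisy labels and save them (assumed workflow).
first = build_train_command(
    "--pixel_denoising", "--pixel_noise_ratio", 0.5, "--visualise_save"
)
# Later run: reuse the saved labels instead of sampling new ones.
later = build_train_command(
    "--pixel_denoising", "--pixel_noise_ratio", 0.5, "--load_saved"
)
for cmd in (first, later):
    print(" ".join(shlex.quote(part) for part in cmd))
```

Each list can be passed to `subprocess.run` from the root directory of the repository to launch the corresponding training run.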

3D Reconstruction of Replica Scenes

We also provide code for extracting a 3D semantic mesh from a trained Semantic-NeRF model.

python3 SSR/extract_colour_mesh.py --sem --mesh_dir PATH_TO_MESH --training_data_dir PATH_TO_TRAINING_DATA --save_dir PATH_TO_SAVE_DIR

For more demos and qualitative results, please check our project page and video.

Acknowledgements

Thanks to nerf, nerf-pytorch and nerf_pl for providing nice and inspiring implementations of NeRF.

Citation

If you find this code or work useful in your own research, please consider citing the following:

@inproceedings{Zhi:etal:ICCV2021,
  title={In-Place Scene Labelling and Understanding with Implicit Scene Representation},
  author={Shuaifeng Zhi and Tristan Laidlow and Stefan Leutenegger and Andrew J. Davison},
  booktitle={ICCV},
  year={2021}
}

Contact

If you have any questions, please contact [email protected] or [email protected].
