Implementation of CVPR'21: RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction

Overview

RfD-Net [Project Page] [Paper] [Video]

RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction
Yinyu Nie, Ji Hou, Xiaoguang Han, Matthias Nießner
In CVPR, 2021.

points.png pred.png

From an incomplete point cloud of a 3D scene (left), our method learns to jointly understand the 3D objects and reconstruct instance meshes as the output (right).


Install

  1. This implementation uses Python 3.6, PyTorch 1.7.1, and CUDA toolkit 11.0. We recommend using conda to deploy the environment.

    • Install with conda:
    conda env create -f environment.yml
    conda activate rfdnet
    
    • Install with pip:
    pip install -r requirements.txt
    
  2. Next, compile the external libraries by

    python setup.py build_ext --inplace
    
  3. Install PointNet++ by

    cd external/pointnet2_ops_lib
    pip install .
    
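To verify that the compiled extension works, you can try a quick furthest-point-sampling call. This is a minimal sketch; the pointnet2_ops module name follows the pointnet2_ops_lib package layout (verify it in your install), and a CUDA-capable GPU is required:

    import torch
    from pointnet2_ops import pointnet2_utils  # module name per pointnet2_ops_lib

    xyz = torch.rand(2, 1024, 3).cuda()                    # (batch, num_points, xyz)
    idx = pointnet2_utils.furthest_point_sample(xyz, 256)  # FPS indices, shape (2, 256)
    print(idx.shape)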

Demo

The pretrained model can be downloaded here. Place the pretrained weights at the path below

out/pretrained_models/pretrained_weight.pth
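
If you want to sanity-check the download before running the demo, the checkpoint should open as a regular PyTorch file. This is a hedged sketch; the exact dictionary layout inside the checkpoint is not documented here:

    import torch

    ckpt = torch.load('out/pretrained_models/pretrained_weight.pth', map_location='cpu')
    if isinstance(ckpt, dict):
        print(list(ckpt.keys())[:5])  # peek at the first few stored entries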

Run the demo below to see how our method works.

cd RfDNet
python main.py --config configs/config_files/ISCNet_test.yaml --mode demo --demo_path demo/inputs/scene0549_00.off

VTK is used here to visualize the 3D scenes. The outputs will be saved under 'demo/outputs'. You can also try this script on your own scans.

If everything goes smoothly, a GUI window will pop up and you can interact with the scene as below.

screenshot_demo.png

If you run it on a machine without an X display server, you can use offscreen mode by setting offline=True in demo.py. The rendered image will be saved to demo/outputs/some_scene_id/pred.png.
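
For reference, offscreen rendering in VTK generally follows the pattern below. This is an illustrative sketch of the mechanism, not the exact code in demo.py; the output path mirrors the one above:

    import vtk

    render_window = vtk.vtkRenderWindow()
    render_window.SetOffScreenRendering(1)  # render without an X display
    render_window.AddRenderer(vtk.vtkRenderer())
    render_window.Render()

    # Grab the framebuffer and write it to a PNG file.
    w2i = vtk.vtkWindowToImageFilter()
    w2i.SetInput(render_window)
    w2i.Update()
    writer = vtk.vtkPNGWriter()
    writer.SetFileName('demo/outputs/some_scene_id/pred.png')
    writer.SetInputConnection(w2i.GetOutputPort())
    writer.Write()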


Prepare Data

In our paper, we use input point clouds from the ScanNet dataset and the annotated instance CAD models from the Scan2CAD dataset. Scan2CAD aligns object CAD models from ShapeNetCore.v2 to each object in ScanNet, and we use these aligned CAD models as the ground truth.

Preprocess ScanNet and Scan2CAD data

You can either directly download the processed samples [link] to the directory below (recommended)

datasets/scannet/processed_data/

or

  1. Apply for access to the ScanNet dataset and download it to
    datasets/scannet/scans
    
  2. Apply for access to the Scan2CAD dataset and download it to
    datasets/scannet/scan2cad_download_link
    
  3. Preprocess the ScanNet and Scan2CAD datasets for training by
    cd RfDNet
    python utils/scannet/gen_scannet_w_orientation.py
    
Preprocess ShapeNet data

You can either directly download the processed data [link] and extract them to datasets/ShapeNetv2_data/ as below

datasets/ShapeNetv2_data/point
datasets/ShapeNetv2_data/pointcloud
datasets/ShapeNetv2_data/voxel
datasets/ShapeNetv2_data/watertight_scaled_simplified

or

  1. Download ShapeNetCore.v2 to the path below

    datasets/ShapeNetCore.v2
    
  2. Process ShapeNet models into watertight meshes by

    python utils/shapenet/1_fuse_shapenetv2.py
    
  3. Sample points on ShapeNet models for training (similar to Occupancy Networks; see the sketch after this list).

    python utils/shapenet/2_sample_mesh.py --resize --packbits --float16
    
  4. There are usually 100K+ points per object mesh. We simplify the meshes to speed up testing and visualization by

    python utils/shapenet/3_simplify_fusion.py --in_dir datasets/ShapeNetv2_data/watertight_scaled --out_dir datasets/ShapeNetv2_data/watertight_scaled_simplified
    
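For reference, the --packbits and --float16 flags in step 3 imply a storage layout like the following. This is a sketch following the Occupancy Networks convention; the file name and array sizes are illustrative:

    import numpy as np

    points = np.random.rand(100000, 3).astype(np.float16)  # --float16: half-precision coordinates
    occupancies = np.random.rand(100000) > 0.5              # boolean inside/outside labels
    packed = np.packbits(occupancies)                       # --packbits: 8 labels per byte

    np.savez('points.npz', points=points, occupancies=packed)

    # Reading back: unpack the bits and truncate to the original point count.
    data = np.load('points.npz')
    occ = np.unpackbits(data['occupancies'])[:len(data['points'])]
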
Verify preprocessed data

After preprocessing the data, you can run the visualization script below to check whether it was generated correctly.

  • Visualize ScanNet+Scan2CAD+ShapeNet samples by

    python utils/scannet/visualization/vis_gt.py
    

    A VTK window will pop up as below.

    verify.png

Training, Generating and Evaluation

We use configuration files (see configs/config_files/*.yaml) to fully control the training/testing/generation process. You can check a template at configs/config_files/ISCNet.yaml.
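
The configs are plain YAML, so you can inspect or modify them programmatically. A minimal sketch using PyYAML; the repo may wrap loading in its own helper:

    import yaml

    with open('configs/config_files/ISCNet.yaml') as f:
        cfg = yaml.safe_load(f)
    print(cfg.keys())  # top-level sections, e.g. the test/generation blocks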

Training

We first pretrain the detection module and the completion module, followed by joint refinement. You can follow the process below.

  1. Pretrain the detection module by

    python main.py --config configs/config_files/ISCNet_detection.yaml --mode train
    

    It will save the detection module weight at out/iscnet/a_folder_with_detection_module/model_best.pth

  2. Copy the weight path of the detection module (see 1.) into configs/config_files/ISCNet_completion.yaml as

    weight: ['out/iscnet/a_folder_with_detection_module/model_best.pth']
    

    Then pretrain the completion module by

    python main.py --config configs/config_files/ISCNet_completion.yaml --mode train
    

    It will save the completion module weight at out/iscnet/a_folder_with_completion_module/model_best.pth

  3. Copy the weight path of the completion module (see 2.) into configs/config_files/ISCNet.yaml as

    weight: ['out/iscnet/a_folder_with_completion_module/model_best.pth']
    

    Then jointly finetune RfD-Net by

    python main.py --config configs/config_files/ISCNet.yaml --mode train
    

    It will save the trained model weight at out/iscnet/a_folder_with_RfD-Net/model_best.pth
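
For context, loading a stage-wise checkpoint listed under weight: above typically amounts to a non-strict state-dict load, so a detection-only checkpoint can initialize a larger model. An illustrative sketch with a toy module, not the repo's actual trainer code:

    import torch

    class TinyNet(torch.nn.Module):  # stand-in for the full model
        def __init__(self):
            super().__init__()
            self.detection = torch.nn.Linear(8, 8)
            self.completion = torch.nn.Linear(8, 8)

    net = TinyNet()
    ckpt = {'detection.weight': torch.zeros(8, 8), 'detection.bias': torch.zeros(8)}
    result = net.load_state_dict(ckpt, strict=False)  # strict=False tolerates missing modules
    print(result.missing_keys)  # the completion parameters, still randomly initialized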

Generating

Copy the weight path of RfD-Net (see 3. above) into configs/config_files/ISCNet_test.yaml as

weight: ['out/iscnet/a_folder_with_RfD-Net/model_best.pth']

Run the command below to generate outputs for all scenes in the test set.

python main.py --config configs/config_files/ISCNet_test.yaml --mode test

The 3D scenes for visualization are saved in out/iscnet/a_folder_with_generated_scenes/visualization. You can visualize an (input, pred, gt) triplet with the demo below

python utils/scannet/visualization/vis_for_comparison.py 

If everything goes smoothly, three windows (corresponding to input, pred, gt) will pop up in sequence as

(Input | Prediction | Ground-truth)

Evaluation

You can choose either of the following ways for evaluation.

  1. You can export all scenes above and calculate the evaluation metrics with any external library (for researchers who would like to unify the benchmark). Lower the dump_threshold in ISCNet_test.yaml during generation to keep more object proposals for mAP calculation (e.g. dump_threshold=0.05).

  2. In our evaluation, we voxelize the 3D scenes to keep the resolution consistent with the baseline methods. To enable this,

    1. make sure the binvox executable is downloaded and its path is configured as an environment variable (e.g. export its path in ~/.bashrc for Ubuntu). It will be invoked by Trimesh (see the sketch at the end of this section).

    2. Change the ISCNet_test.yaml as below for evaluation.

       test:
         evaluate_mesh_mAP: True
       generation:
         dump_results: False
    

    Run the command below to report the evaluation results.

    python main.py --config configs/config_files/ISCNet_test.yaml --mode test
    

    The log file will be saved at out/iscnet/a_folder_named_with_script_time/log.txt
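
For reference, Trimesh drives binvox roughly as follows once the executable is on your PATH. This is a hedged sketch; the mesh path and resolution are illustrative:

    import trimesh
    from trimesh.exchange.binvox import voxelize_mesh

    mesh = trimesh.load('demo/outputs/some_scene_id/pred.ply')  # any watertight mesh file
    vox = voxelize_mesh(mesh, dimension=32)  # shells out to binvox, returns a VoxelGrid
    print(vox.filled_count)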


Differences from the paper

  1. The original paper was implemented with PyTorch 1.1.0; we have reconfigured the code for PyTorch 1.7.1.
  2. A post-processing step that aligns the reconstructed shapes to the input scan is supported. We have verified that it improves the evaluation performance by a small margin. You can switch it on/off following demo.py.
  3. A different learning-rate scheduler is adopted: the learning rate decays to 0.1x when there is no gain within 20 steps, which is much more efficient. A sketch of the equivalent scheduler follows.
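
The described schedule matches the behavior of PyTorch's built-in ReduceLROnPlateau. A sketch of the equivalent setup, not necessarily the repo's exact code:

    import torch
    from torch.optim.lr_scheduler import ReduceLROnPlateau

    model = torch.nn.Linear(10, 1)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    # factor=0.1 and patience=20 mirror "decays to 0.1x after 20 steps without gain"
    scheduler = ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=20)

    for epoch in range(30):
        val_loss = 1.0  # placeholder: use the real validation loss here
        scheduler.step(val_loss)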

Citation

If you find our work helpful, please consider citing

@inproceedings{Nie_2021_CVPR,
    title={RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction},
    author={Nie, Yinyu and Hou, Ji and Han, Xiaoguang and Nie{\ss}ner, Matthias},
    booktitle={Proc. Computer Vision and Pattern Recognition (CVPR), IEEE},
    year={2021}
}


License

RfD-Net is released under the MIT License. See the LICENSE file for more details.

Owner
Yinyu Nie
Currently a Ph.D. student at NCCA, Bournemouth University.