[CVPR 2022] Back To Reality: Weak-supervised 3D Object Detection with Shape-guided Label Enhancement

Overview

Back To Reality: Weak-supervised 3D Object Detection with Shape-guided Label Enhancement

Announcement πŸ”₯

We have not tested the code yet. We will finish this project by April.

Introduction

This repo contains PyTorch implementation for paper Back To Reality: Weak-supervised 3D Object Detection with Shape-guided Label Enhancement (CVPR2022)

overview

@inproceedings{xu2022br,
author = {Xu, Xiuwei and Wang, Yifan and Zheng, Yu and Rao, Yongming and Lu, Jiwen and Zhou, Jie},
title = {Back To Reality: Weak-supervised 3D Object Detection with Shape-guided Label Enhancement},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2022}
}

Other papers related to 3D object detection with synthetic shape:

  • RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection (ICCV 2021)

New dataset πŸ’₯

We conduct additional experiment on the more challenging Matterport3D dataset. From ModelNet40 and Matterport3D, we select all 13 shared categories, each containing more than 80 object instances in Matterport3D training set, to construct our benchmark (Matterport3d-md40). Below is the performance of FSB, WSB and BR (point-version) based on Votenet: overview

Note that we use OpenCV to estimate the rotated bounding boxes (RBB) as ground-truth, instead of the axis-aligned bounding boxes used in ScanNet-md40 benchmark.

ScanNet-md40 and Matterport3d-md40 are two more challenging benckmarks for indoor 3D object detection. We hope they will promote future research on small object detection and synthetic-to-real scene understanding.

Dependencies

We evaluate this code with Pytorch 1.8.1 (cuda11), which is based on the official implementation of Votenet and GroupFree3D. Please follow the requirements of them to prepare the environment. Other packages can be installed using:

pip install open3d sklearn tqdm

Current code base is tested under following environment:

  1. Python 3.6.13
  2. PyTorch 1.8.1
  3. numpy 1.19.2
  4. open3d 0.12.0
  5. opencv-python 4.5.1.48
  6. plyfile 0.7.3
  7. scikit-learn 0.24.1

Data preparation

ScanNet

To start from the raw data, you should:

  • Follow the README under GroupFree3D/scannet or Votenet/scannet to generate the real scenes.
  • Follow the README under ./data_generation/ScanNet to generate the virtual scenes.

The processed data can also be downloaded from here. They should be placed to paths:

./detection/Votenet/scannet/
./detection/GroupFree3D/scannet/

After that, the file directory should be like:

...
└── Votenet (or GroupFree3D)
    β”œβ”€β”€ ...
    └── scannet
        β”œβ”€β”€ ...
        β”œβ”€β”€ scannet_train_detection_data_md40
        β”œβ”€β”€ scannet_train_detection_data_md40_obj_aug
        └── scannet_train_detection_data_md40_obj_mesh_aug

Matterport3D

To start from the raw data, you should:

  • Follow the README under Votenet/scannet to generate the real scenes.
  • Follow the README under ./data_generation/Matterport3D to generate the virtual scenes.

The processed data can also be downloaded from here.

The file directory should be like:

...
└── Votenet
    β”œβ”€β”€ ...
    └── matterport
        β”œβ”€β”€ ...
        β”œβ”€β”€ matterport_train_detection_data_md40
        β”œβ”€β”€ matterport_train_detection_data_md40_obj_aug
        └── matterport_train_detection_data_md40_obj_mesh_aug

Usage

Please follow the instructions below to train different models on ScanNet. Change --dataset scannet to --dataset matterport for training on Matterport3D.

Votenet

1. Fully-Supervised Baseline

To train the Fully-Supervised Baseline (FSB) on Scannet data:

# Recommended GPU num: 1

cd Votenet

CUDA_VISIBLE_DEVICES=0 python train_Votenet_FSB.py --dataset scannet --log_dir log_Votenet_FSB --num_point 40000

2. Weakly-Supervised Baseline

To train the Weakly-Supervised Baseline (WSB) on Scannet data:

# Recommended num of GPUs: 1

CUDA_VISIBLE_DEVICES=0 python train_Votenet_WSB.py --dataset scannet --log_dir log_Votenet_WSB --num_point 40000

3. Back To Reality

To train BR (mesh-version) on Scannet data, please run:

# Recommended num of GPUs: 2

CUDA_VISIBLE_DEVICES=0,1 python train_Votenet_BR.py --dataset scannet --log_dir log_Votenet_BRM --num_point 40000

CUDA_VISIBLE_DEVICES=0,1 python train_Votenet_BR_CenterRefine --dataset scannet --log_dir log_Votenet_BRM_Refine --num_point 40000 --checkpoint_path log_Votenet_BRM/train_BR.tar

To train BR (point-version) on Scannet data, please run:

# Recommended num of GPUs: 2

CUDA_VISIBLE_DEVICES=0,1 python train_Votenet_BR.py --dataset scannet --log_dir log_Votenet_BRP --num_point 40000 --dataset_without_mesh

CUDA_VISIBLE_DEVICES=0,1 python train_Votenet_BR_CenterRefine --dataset scannet --log_dir log_Votenet_BRP_Refine --num_point 40000 --checkpoint_path log_Votenet_BRP/train_BR.tar --dataset_without_mesh

GroupFree3D

1. Fully-Supervised Baseline

To train the Fully-Supervised Baseline (FSB) on Scannet data:

# Recommended num of GPUs: 4

cd GroupFree3D

python -m torch.distributed.launch --master_port <port_num> --nproc_per_node <num_of_gpus_to_use> train_GF_FSB.py --num_point 50000 --num_decoder_layers 6 --size_delta 0.111111111111 --center_delta 0.04 --learning_rate 0.006 --decoder_learning_rate 0.0006 --weight_decay 0.0005 --dataset scannet --log_dir log_GF_FSB --batch_size 4

2. Weakly-Supervised Baseline

To train the Weakly-Supervised Baseline (WSB) on Scannet data:

# Recommended num of GPUs: 4

python -m torch.distributed.launch --master_port <port_num> --nproc_per_node <num_of_gpus_to_use> train_GF_WSB.py --num_point 50000 --num_decoder_layers 6 --size_delta 0.111111111111 --center_delta 0.04 --learning_rate 0.006 --decoder_learning_rate 0.0006 --weight_decay 0.0005 --dataset scannet --log_dir log_GF_WSB --batch_size 4

3. Back To Reality

To train BR (mesh-version) on Scannet data, please run:

# Recommended num of GPUs: 4

python -m torch.distributed.launch --master_port <port_num> --nproc_per_node <num_of_gpus_to_use> train_GF_BR.py --num_point 50000 --num_decoder_layers 6 --size_delta 0.111111111111 --center_delta 0.04 --learning_rate 0.006 --decoder_learning_rate 0.0006 --weight_decay 0.0005 --dataset scannet --log_dir log_GF_BRM --batch_size 4

# Recommended num of GPUs: 6

python -m torch.distributed.launch --master_port <port_num> --nproc_per_node <num_of_gpus_to_use> train_GF_BR_CenterRefine.py --num_point 50000 --num_decoder_layers 6 --size_delta 0.111111111111 --center_delta 0.04 --learning_rate 0.001 --decoder_learning_rate 0.0006 --weight_decay 0.0005 --dataset scannet --log_dir log_GF_BRM_Refine --checkpoint_path <[checkpoint_path_of_groupfree3D]/ckpt_epoch_last.pth> --max_epoch 120 --val_freq 10 --save_freq 20 --batch_size 2

To train BR (point-version) on Scannet data, please run:

# Recommended num of GPUs: 4

python -m torch.distributed.launch --master_port <port_num> --nproc_per_node <num_of_gpus_to_use> train_GF_BR.py --num_point 50000 --num_decoder_layers 6 --size_delta 0.111111111111 --center_delta 0.04 --learning_rate 0.006 --decoder_learning_rate 0.0006 --weight_decay 0.0005 --dataset scannet --log_dir log_GF_BRP --batch_size 4 --dataset_without_mesh

# Recommended num of GPUs: 6

python -m torch.distributed.launch --master_port <port_num> --nproc_per_node <num_of_gpus_to_use> train_GF_BR_CenterRefine.py --num_point 50000 --num_decoder_layers 6 --size_delta 0.111111111111 --center_delta 0.04 --learning_rate 0.001 --decoder_learning_rate 0.0006 --weight_decay 0.0005 --dataset scannet --log_dir log_GF_BRP_Refine --checkpoint_path <[checkpoint_path_of_groupfree3D]/ckpt_epoch_last.pth> --max_epoch 120 --val_freq 10 --save_freq 20 --batch_size 2 --dataset_without_mesh

TODO list

We will add the following to this repo:

  • Virtual scene generation for Matterport3D
  • Data and code for training Votenet (both baseline and BR) on the Matterport3D dataset

Acknowledgements

We thank a lot for the flexible codebase of Votenet and GroupFree3D.

Owner
Xiuwei Xu
3D vision, data/computation-efficient learning
Xiuwei Xu
A knowledge base construction engine for richly formatted data

Fonduer is a Python package and framework for building knowledge base construction (KBC) applications from richly formatted data. Note that Fonduer is

HazyResearch 386 Dec 05, 2022
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.

The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate. Website β€’ Key Features β€’ How To Use β€’ Docs β€’

Pytorch Lightning 21.1k Jan 01, 2023
Implementation for NeurIPS 2021 Submission: SparseFed

READ THIS FIRST This repo is an anonymized version of an existing repository of GitHub, for the AIStats 2021 submission: SparseFed: Mitigating Model P

2 Jun 15, 2022
Repository for the electrical and ICT benchmark model developed in the ERIGrid 2.0 project.

Benchmark Model Electrical and ICT System This repository contains the documentation, code, and models for the electrical and ICT benchmark model deve

ERIGrid 2.0 1 Nov 29, 2021
Building a real-time environment using webcam frame division in OpenCV and classify cropped images using a fine-tuned vision transformers on hybryd datasets samples for facial emotion recognition.

Visual Transformer for Facial Emotion Recognition (FER) This project has the aim to build an efficient Visual Transformer for the Facial Emotion Recog

Mario Sessa 8 Dec 12, 2022
Dataset and Code for the paper "DepthTrack: Unveiling the Power of RGBD Tracking" (ICCV2021), and "Depth-only Object Tracking" (BMVC2021)

DeT and DOT Code and datasets for "DepthTrack: Unveiling the Power of RGBD Tracking" (ICCV2021) "Depth-only Object Tracking" (BMVC2021) @InProceedings

Yan Song 55 Dec 15, 2022
DynaTune: Dynamic Tensor Program Optimization in Deep Neural Network Compilation

DynaTune: Dynamic Tensor Program Optimization in Deep Neural Network Compilation This repository is the implementation of DynaTune paper. This folder

4 Nov 02, 2022
Code for Subgraph Federated Learning with Missing Neighbor Generation (NeurIPS 2021)

To run the code Unzip the package to your local directory; Run 'pip install -r requirements.txt' to download required packages; Open file ~/nips_code/

32 Dec 26, 2022
A high-performance distributed deep learning system targeting large-scale and automated distributed training.

HETU Documentation | Examples Hetu is a high-performance distributed deep learning system targeting trillions of parameters DL model training, develop

DAIR Lab 150 Dec 21, 2022
Implementation of CSRL from the AAAI2022 paper: Constraint Sampling Reinforcement Learning: Incorporating Expertise For Faster Learning

CSRL Implementation of CSRL from the AAAI2022 paper: Constraint Sampling Reinforcement Learning: Incorporating Expertise For Faster Learning Python: 3

4 Apr 14, 2022
This is the repository for CVPR2021 Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales

Intro This is the repository for CVPR2021 Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales Vehicle Sam

39 Jul 21, 2022
This code is for eCaReNet: explainable Cancer Relapse Prediction Network.

eCaReNet This code is for eCaReNet: explainable Cancer Relapse Prediction Network. (Towards Explainable End-to-End Prostate Cancer Relapse Prediction

Institute of Medical Systems Biology 2 Jul 28, 2022
Portfolio analytics for quants, written in Python

QuantStats: Portfolio analytics for quants QuantStats Python library that performs portfolio profiling, allowing quants and portfolio managers to unde

Ran Aroussi 2.7k Jan 08, 2023
AnimationKit: AI Upscaling & Interpolation using Real-ESRGAN+RIFE

ALPHA 2.5: Frostbite Revival (Released 12/23/21) Changelog: [ UI ] Chained design. All steps link to one another! Use the master override toggles to s

87 Nov 16, 2022
Implementation of BI-RADS-BERT & The Advantages of Section Tokenization.

BI-RADS BERT Implementation of BI-RADS-BERT & The Advantages of Section Tokenization. This implementation could be used on other radiology in house co

1 May 17, 2022
Official repository of the paper Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision

Official repository of the paper Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision

Soubhik Sanyal 689 Dec 25, 2022
γ€ŠUnsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement》(ECCV 2020) GitHub: [fig9]

Unsupervised 3D Human Pose Representation [Paper] The implementation of our paper Unsupervised 3D Human Pose Representation with Viewpoint and Pose Di

42 Nov 24, 2022
FaceAPI: AI-powered Face Detection & Rotation Tracking, Face Description & Recognition, Age & Gender & Emotion Prediction for Browser and NodeJS using TensorFlow/JS

FaceAPI AI-powered Face Detection & Rotation Tracking, Face Description & Recognition, Age & Gender & Emotion Prediction for Browser and NodeJS using

Vladimir Mandic 395 Dec 29, 2022
Subnet Replacement Attack: Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks

Subnet Replacement Attack: Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks Official implementation of paper Towards Practic

Xiangyu Qi 8 Dec 30, 2022
Files for a tutorial to train SegNet for road scenes using the CamVid dataset

SegNet and Bayesian SegNet Tutorial This repository contains all the files for you to complete the 'Getting Started with SegNet' and the 'Bayesian Seg

Alex Kendall 800 Dec 31, 2022