BADet: Boundary-Aware 3D Object Detection from Point Clouds (Pattern Recognition 2022)

Related tags

Deep LearningBADet
Overview

BADet: Boundary-Aware 3D Object Detection from Point Clouds (Pattern Recognition 2022)

As of Apr. 17th, 2021, 1st place in KITTI BEV detection leaderboard and on par performance on KITTI 3D detection leaderboard. The detector can run at 7.1 FPS.

Authors: Rui Qian, Xin Lai, Xirong Li

[arXiv] [elsevier]

Citation

If you find this code useful in your research, please consider citing our work:

@InProceedings{qian2022pr,
author = {Rui Qian and Xin Lai and Xirong Li},
title = {BADet: Boundary-Aware 3D Object Detection from Point Clouds},
booktitle = {Pattern Recognition (PR)},
month = {January},
year = {2022}
}
@misc{qian20213d,
title={3D Object Detection for Autonomous Driving: A Survey}, 
author={Rui Qian and Xin Lai and Xirong Li},
year={2021},
eprint={2106.10823},
archivePrefix={arXiv},
primaryClass={cs.CV}
}

Updates

2021-03-17: The performance (using 40 recall poisitions) on test set is as follows:

Car [email protected], 0.70, 0.70:
bbox AP:98.75, 95.61, 90.64
bev  AP:95.23, 91.32, 86.48 
3d   AP:89.28, 81.61, 76.58 
aos  AP:98.65, 95.34, 90.28 

Introduction

model Currently, existing state-of-the-art 3D object detectors are in two-stage paradigm. These methods typically comprise two steps: 1) Utilize a region proposal network to propose a handful of high-quality proposals in a bottom-up fashion. 2) Resize and pool the semantic features from the proposed regions to summarize RoI-wise representations for further refinement. Note that these RoI-wise representations in step 2) are considered individually as uncorrelated entries when fed to following detection headers. Nevertheless, we observe these proposals generated by step 1) offset from ground truth somehow, emerging in local neighborhood densely with an underlying probability. Challenges arise in the case where a proposal largely forsakes its boundary information due to coordinate offset while existing networks lack corresponding information compensation mechanism. In this paper, we propose $BADet$ for 3D object detection from point clouds. Specifically, instead of refining each proposal independently as previous works do, we represent each proposal as a node for graph construction within a given cut-off threshold, associating proposals in the form of local neighborhood graph, with boundary correlations of an object being explicitly exploited. Besides, we devise a lightweight Region Feature Aggregation Module to fully exploit voxel-wise, pixel-wise, and point-wise features with expanding receptive fields for more informative RoI-wise representations. We validate BADet both on widely used KITTI Dataset and highly challenging nuScenes Dataset. As of Apr. 17th, 2021, our BADet achieves on par performance on KITTI 3D detection leaderboard and ranks $1^{st}$ on $Moderate$ difficulty of $Car$ category on KITTI BEV detection leaderboard. The source code is available at https://github.com/rui-qian/BADet.

Dependencies

  • python3.5+
  • pytorch (tested on 1.1.0)
  • opencv
  • shapely
  • mayavi
  • spconv (v1.0)

Installation

  1. Clone this repository.
  2. Compile C++/CUDA modules in mmdet/ops by running the following command at each directory, e.g.
$ cd mmdet/ops/points_op
$ python3 setup.py build_ext --inplace
  1. Setup following Environment variables, you may add them to ~/.bashrc:
export NUMBAPRO_CUDA_DRIVER=/usr/lib/x86_64-linux-gnu/libcuda.so
export NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so
export NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice
export LD_LIBRARY_PATH=/home/qianrui/anaconda3/lib/python3.7/site-packages/spconv;

Data Preparation

  1. Download the 3D KITTI detection dataset from here. Data to download include:

    • Velodyne point clouds (29 GB): input data to VoxelNet
    • Training labels of object data set (5 MB): input label to VoxelNet
    • Camera calibration matrices of object data set (16 MB): for visualization of predictions
    • Left color images of object data set (12 GB): for visualization of predictions
  2. Create cropped point cloud and sample pool for data augmentation, please refer to SECOND.

  3. Split the training set into training and validation set according to the protocol here.

  4. You could run the following command to prepare Data:

$ python3 tools/create_data.py

[email protected]:~/qianrui/kitti$ tree -L 1
data_root = '/home/qr/qianrui/kitti/'
├── gt_database
├── ImageSets
├── kitti_dbinfos_train.pkl
├── kitti_dbinfos_trainval.pkl
├── kitti_infos_test.pkl
├── kitti_infos_train.pkl
├── kitti_infos_trainval.pkl
├── kitti_infos_val.pkl
├── train.txt
├── trainval.txt
├── val.txt
├── test.txt
├── training   <-- training data
|       ├── image_2
|       ├── label_2
|       ├── velodyne
|       └── velodyne_reduced
└── testing  <--- testing data
|       ├── image_2
|       ├── label_2
|       ├── velodyne
|       └── velodyne_reduced

Pretrained Model

You can download the pretrained model [Model][Archive], which is trained on the train split (3712 samples) and evaluated on the val split (3769 samples) and test split (7518 samples). The performance (using 11 recall poisitions) on validation set is as follows:

[40, 1600, 1408]
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 3769/3769, 7.1 task/s, elapsed: 533s, ETA:     0s
Car [email protected], 0.70, 0.70:
bbox AP:98.27, 90.22, 89.66
bev  AP:90.59, 88.85, 88.09
3d   AP:90.06, 85.75, 78.98
aos  AP:98.18, 89.98, 89.25
Car [email protected], 0.50, 0.50:
bbox AP:98.27, 90.22, 89.66
bev  AP:98.31, 90.21, 89.73
3d   AP:98.20, 90.11, 89.61
aos  AP:98.18, 89.98, 89.25

Quick demo

You could run the following command to evaluate the pretrained model:

cd mmdet/tools
# vim ../configs/car_cfg.py(modify score_thr=0.4, score_thr=0.3 for val split and test split respectively.)
python3 test.py ../configs/car_cfg.py ../saved_model_vehicle/epoch_50.pth
Model Archive Parameters Moderate(Car) Pretrained Model Predicts
BADet(val) [Link] 44.2 MB 86.21% [icloud drive] [Results]
BADet(test) [Link] 44.2 MB 81.61% [icloud drive] [Results]

Training

To train the BADet with single GPU, run the following command:

cd mmdet/tools
python3 train.py ../configs/car_cfg.py

Inference

To evaluate the model, run the following command:

cd mmdet/tools
python3 test.py ../configs/car_cfg.py ../saved_model_vehicle/latest.pth

Acknowledgement

The code is devloped based on mmdetection, some part of codes are borrowed from SA-SSD, SECOND, and PointRCNN.

Contact

If you have questions, you can contact [email protected].

Owner
Rui Qian
Rui Qian
A smart Chat bot that can help to know about corona virus and Make prediction of corona using X-ray.

TRINIT_Hum_kuchh_nahi_karenge_ML01 Document Link https://github.com/Jatin-Goyal-552/TRINIT_Hum_kuchh_nahi_karenge_ML01/blob/main/hum_kuchh_nahi_kareng

JatinGoyal 1 Feb 03, 2022
Implementation of Retrieval-Augmented Denoising Diffusion Probabilistic Models in Pytorch

Retrieval-Augmented Denoising Diffusion Probabilistic Models (wip) Implementation of Retrieval-Augmented Denoising Diffusion Probabilistic Models in P

Phil Wang 55 Jan 01, 2023
Swin-Transformer is basically a hierarchical Transformer whose representation is computed with shifted windows.

Swin-Transformer Swin-Transformer is basically a hierarchical Transformer whose representation is computed with shifted windows. For more details, ple

旷视天元 MegEngine 9 Mar 14, 2022
MicRank is a Learning to Rank neural channel selection framework where a DNN is trained to rank microphone channels.

MicRank: Learning to Rank Microphones for Distant Speech Recognition Application Scenario Many applications nowadays envision the presence of multiple

Samuele Cornell 20 Nov 10, 2022
Official implementation for the paper: Permutation Invariant Graph Generation via Score-Based Generative Modeling

Permutation Invariant Graph Generation via Score-Based Generative Modeling This repo contains the official implementation for the paper Permutation In

64 Dec 29, 2022
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT

LightHuBERT LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT | Github | Huggingface | SUPER

WangRui 46 Dec 29, 2022
Full-featured Decision Trees and Random Forests learner.

CID3 This is a full-featured Decision Trees and Random Forests learner. It can save trees or forests to disk for later use. It is possible to query tr

Alejandro Penate-Diaz 3 Aug 15, 2022
Semi-Supervised Semantic Segmentation via Adaptive Equalization Learning, NeurIPS 2021 (Spotlight)

Semi-Supervised Semantic Segmentation via Adaptive Equalization Learning, NeurIPS 2021 (Spotlight) Abstract Due to the limited and even imbalanced dat

Hanzhe Hu 99 Dec 12, 2022
Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.

EfficientZero (NeurIPS 2021) Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021. Thank you for you

Weirui Ye 671 Jan 03, 2023
Projects for AI/ML and IoT integration for games and other presented at re:Invent 2021.

Playground4AWS Projects for AI/ML and IoT integration for games and other presented at re:Invent 2021. Architecture Minecraft and Lamps This project i

Vinicius Senger 5 Nov 30, 2022
FACIAL: Synthesizing Dynamic Talking Face With Implicit Attribute Learning. ICCV, 2021.

FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning PyTorch implementation for the paper: FACIAL: Synthesizing Dynamic Talking

226 Jan 08, 2023
CLIP: Connecting Text and Image (Learning Transferable Visual Models From Natural Language Supervision)

CLIP (Contrastive Language–Image Pre-training) Experiments (Evaluation) Model Dataset Acc (%) ViT-B/32 (Paper) CIFAR100 65.1 ViT-B/32 (Our) CIFAR100 6

Myeongjun Kim 52 Jan 07, 2023
Computational modelling of ray propagation through optical elements using the principles of geometric optics (Ray Tracer)

Computational modelling of ray propagation through optical elements using the principles of geometric optics (Ray Tracer) Introduction By applying the

Son Gyo Jung 1 Jul 09, 2022
Repository for "Space-Time Correspondence as a Contrastive Random Walk" (NeurIPS 2020)

Space-Time Correspondence as a Contrastive Random Walk This is the repository for Space-Time Correspondence as a Contrastive Random Walk, published at

A. Jabri 239 Dec 27, 2022
Model-based reinforcement learning in TensorFlow

Bellman Website | Twitter | Documentation (latest) What does Bellman do? Bellman is a package for model-based reinforcement learning (MBRL) in Python,

46 Nov 09, 2022
Official code for ICCV2021 paper "M3D-VTON: A Monocular-to-3D Virtual Try-on Network"

M3D-VTON: A Monocular-to-3D Virtual Try-On Network Official code for ICCV2021 paper "M3D-VTON: A Monocular-to-3D Virtual Try-on Network" Paper | Suppl

109 Dec 29, 2022
Blender Add-On for slicing meshes with planes

MeshSlicer Blender Add-On for slicing meshes with multiple overlapping planes at once. This is a simple Blender addon to slice a silmple mesh with mul

52 Dec 12, 2022
converts nominal survey data into a numerical value based on a dictionary lookup.

SWAP RATE Converts nominal survey data into a numerical values based on a dictionary lookup. It allows the user to switch nominal scale data from text

Jake Rhodes 1 Jan 18, 2022
StrongSORT: Make DeepSORT Great Again

StrongSORT StrongSORT: Make DeepSORT Great Again StrongSORT: Make DeepSORT Great Again Yunhao Du, Yang Song, Bo Yang, Yanyun Zhao arxiv 2202.13514 Abs

369 Jan 04, 2023
Monocular 3D pose estimation. OpenVINO. CPU inference or iGPU (OpenCL) inference.

human-pose-estimation-3d-python-cpp RealSenseD435 (RGB) 480x640 + CPU Corei9 45 FPS (Depth is not used) 1. Run 1-1. RealSenseD435 (RGB) 480x640 + CPU

Katsuya Hyodo 8 Oct 03, 2022