BADet: Boundary-Aware 3D Object Detection from Point Clouds (Pattern Recognition 2022)

Related tags

Deep LearningBADet
Overview

BADet: Boundary-Aware 3D Object Detection from Point Clouds (Pattern Recognition 2022)

As of Apr. 17th, 2021, 1st place in KITTI BEV detection leaderboard and on par performance on KITTI 3D detection leaderboard. The detector can run at 7.1 FPS.

Authors: Rui Qian, Xin Lai, Xirong Li

[arXiv] [elsevier]

Citation

If you find this code useful in your research, please consider citing our work:

@InProceedings{qian2022pr,
author = {Rui Qian and Xin Lai and Xirong Li},
title = {BADet: Boundary-Aware 3D Object Detection from Point Clouds},
booktitle = {Pattern Recognition (PR)},
month = {January},
year = {2022}
}
@misc{qian20213d,
title={3D Object Detection for Autonomous Driving: A Survey}, 
author={Rui Qian and Xin Lai and Xirong Li},
year={2021},
eprint={2106.10823},
archivePrefix={arXiv},
primaryClass={cs.CV}
}

Updates

2021-03-17: The performance (using 40 recall poisitions) on test set is as follows:

Car [email protected], 0.70, 0.70:
bbox AP:98.75, 95.61, 90.64
bev  AP:95.23, 91.32, 86.48 
3d   AP:89.28, 81.61, 76.58 
aos  AP:98.65, 95.34, 90.28 

Introduction

model Currently, existing state-of-the-art 3D object detectors are in two-stage paradigm. These methods typically comprise two steps: 1) Utilize a region proposal network to propose a handful of high-quality proposals in a bottom-up fashion. 2) Resize and pool the semantic features from the proposed regions to summarize RoI-wise representations for further refinement. Note that these RoI-wise representations in step 2) are considered individually as uncorrelated entries when fed to following detection headers. Nevertheless, we observe these proposals generated by step 1) offset from ground truth somehow, emerging in local neighborhood densely with an underlying probability. Challenges arise in the case where a proposal largely forsakes its boundary information due to coordinate offset while existing networks lack corresponding information compensation mechanism. In this paper, we propose $BADet$ for 3D object detection from point clouds. Specifically, instead of refining each proposal independently as previous works do, we represent each proposal as a node for graph construction within a given cut-off threshold, associating proposals in the form of local neighborhood graph, with boundary correlations of an object being explicitly exploited. Besides, we devise a lightweight Region Feature Aggregation Module to fully exploit voxel-wise, pixel-wise, and point-wise features with expanding receptive fields for more informative RoI-wise representations. We validate BADet both on widely used KITTI Dataset and highly challenging nuScenes Dataset. As of Apr. 17th, 2021, our BADet achieves on par performance on KITTI 3D detection leaderboard and ranks $1^{st}$ on $Moderate$ difficulty of $Car$ category on KITTI BEV detection leaderboard. The source code is available at https://github.com/rui-qian/BADet.

Dependencies

  • python3.5+
  • pytorch (tested on 1.1.0)
  • opencv
  • shapely
  • mayavi
  • spconv (v1.0)

Installation

  1. Clone this repository.
  2. Compile C++/CUDA modules in mmdet/ops by running the following command at each directory, e.g.
$ cd mmdet/ops/points_op
$ python3 setup.py build_ext --inplace
  1. Setup following Environment variables, you may add them to ~/.bashrc:
export NUMBAPRO_CUDA_DRIVER=/usr/lib/x86_64-linux-gnu/libcuda.so
export NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so
export NUMBAPRO_LIBDEVICE=/usr/local/cuda/nvvm/libdevice
export LD_LIBRARY_PATH=/home/qianrui/anaconda3/lib/python3.7/site-packages/spconv;

Data Preparation

  1. Download the 3D KITTI detection dataset from here. Data to download include:

    • Velodyne point clouds (29 GB): input data to VoxelNet
    • Training labels of object data set (5 MB): input label to VoxelNet
    • Camera calibration matrices of object data set (16 MB): for visualization of predictions
    • Left color images of object data set (12 GB): for visualization of predictions
  2. Create cropped point cloud and sample pool for data augmentation, please refer to SECOND.

  3. Split the training set into training and validation set according to the protocol here.

  4. You could run the following command to prepare Data:

$ python3 tools/create_data.py

[email protected]:~/qianrui/kitti$ tree -L 1
data_root = '/home/qr/qianrui/kitti/'
├── gt_database
├── ImageSets
├── kitti_dbinfos_train.pkl
├── kitti_dbinfos_trainval.pkl
├── kitti_infos_test.pkl
├── kitti_infos_train.pkl
├── kitti_infos_trainval.pkl
├── kitti_infos_val.pkl
├── train.txt
├── trainval.txt
├── val.txt
├── test.txt
├── training   <-- training data
|       ├── image_2
|       ├── label_2
|       ├── velodyne
|       └── velodyne_reduced
└── testing  <--- testing data
|       ├── image_2
|       ├── label_2
|       ├── velodyne
|       └── velodyne_reduced

Pretrained Model

You can download the pretrained model [Model][Archive], which is trained on the train split (3712 samples) and evaluated on the val split (3769 samples) and test split (7518 samples). The performance (using 11 recall poisitions) on validation set is as follows:

[40, 1600, 1408]
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 3769/3769, 7.1 task/s, elapsed: 533s, ETA:     0s
Car [email protected], 0.70, 0.70:
bbox AP:98.27, 90.22, 89.66
bev  AP:90.59, 88.85, 88.09
3d   AP:90.06, 85.75, 78.98
aos  AP:98.18, 89.98, 89.25
Car [email protected], 0.50, 0.50:
bbox AP:98.27, 90.22, 89.66
bev  AP:98.31, 90.21, 89.73
3d   AP:98.20, 90.11, 89.61
aos  AP:98.18, 89.98, 89.25

Quick demo

You could run the following command to evaluate the pretrained model:

cd mmdet/tools
# vim ../configs/car_cfg.py(modify score_thr=0.4, score_thr=0.3 for val split and test split respectively.)
python3 test.py ../configs/car_cfg.py ../saved_model_vehicle/epoch_50.pth
Model Archive Parameters Moderate(Car) Pretrained Model Predicts
BADet(val) [Link] 44.2 MB 86.21% [icloud drive] [Results]
BADet(test) [Link] 44.2 MB 81.61% [icloud drive] [Results]

Training

To train the BADet with single GPU, run the following command:

cd mmdet/tools
python3 train.py ../configs/car_cfg.py

Inference

To evaluate the model, run the following command:

cd mmdet/tools
python3 test.py ../configs/car_cfg.py ../saved_model_vehicle/latest.pth

Acknowledgement

The code is devloped based on mmdetection, some part of codes are borrowed from SA-SSD, SECOND, and PointRCNN.

Contact

If you have questions, you can contact [email protected].

Owner
Rui Qian
Rui Qian
Official PyTorch Implementation of Learning Self-Similarity in Space and Time as Generalized Motion for Video Action Recognition, ICCV 2021

Official PyTorch Implementation of Learning Self-Similarity in Space and Time as Generalized Motion for Video Action Recognition, ICCV 2021

26 Dec 07, 2022
1st-in-MICCAI2020-CPM - Combined Radiology and Pathology Classification

Combined Radiology and Pathology Classification MICCAI 2020 Combined Radiology a

22 Dec 08, 2022
Optical machine for senses sensing using speckle and deep learning

# Senses-speckle [Remote Photonic Detection of Human Senses Using Secondary Speckle Patterns](https://doi.org/10.21203/rs.3.rs-724587/v1) paper Python

Zeev Kalyuzhner 0 Sep 26, 2021
PyTorch implementation of the cross-modality generative model that synthesizes dance from music.

Dancing to Music PyTorch implementation of the cross-modality generative model that synthesizes dance from music. Paper Hsin-Ying Lee, Xiaodong Yang,

NVIDIA Research Projects 485 Dec 26, 2022
Lucid library adapted for PyTorch

Lucent PyTorch + Lucid = Lucent The wonderful Lucid library adapted for the wonderful PyTorch! Lucent is not affiliated with Lucid or OpenAI's Clarity

Lim Swee Kiat 520 Dec 26, 2022
ZeroGen: Efficient Zero-shot Learning via Dataset Generation

ZEROGEN This repository contains the code for our paper “ZeroGen: Efficient Zero

Jiacheng Ye 31 Dec 30, 2022
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.

This is the Vowpal Wabbit fast online learning code. Why Vowpal Wabbit? Vowpal Wabbit is a machine learning system which pushes the frontier of machin

Vowpal Wabbit 8.1k Jan 06, 2023
HandTailor: Towards High-Precision Monocular 3D Hand Recovery

HandTailor This repository is the implementation code and model of the paper "HandTailor: Towards High-Precision Monocular 3D Hand Recovery" (arXiv) G

Lv Jun 113 Jan 06, 2023
On Out-of-distribution Detection with Energy-based Models

On Out-of-distribution Detection with Energy-based Models This repository contains the code for the experiments conducted in the paper On Out-of-distr

Sven 19 Aug 07, 2022
torchsummaryDynamic: support real FLOPs calculation of dynamic network or user-custom PyTorch ops

torchsummaryDynamic Improved tool of torchsummaryX. torchsummaryDynamic support real FLOPs calculation of dynamic network or user-custom PyTorch ops.

Bohong Chen 1 Jan 07, 2022
FPSAutomaticAiming——基于YOLOV5的FPS类游戏自动瞄准AI

FPSAutomaticAiming——基于YOLOV5的FPS类游戏自动瞄准AI 声明: 本项目仅限于学习交流,不可用于非法用途,包括但不限于:用于游戏外挂等,使用本项目产生的任何后果与本人无关! 简介 本项目基于yolov5,实现了一款FPS类游戏(CF、CSGO等)的自瞄AI,本项目旨在使用现

Fabian 246 Dec 28, 2022
Linear Variational State Space Filters

Linear Variational State Space Filters To set up the environment, use the provided scripts in the docker/ folder to build and run the codebase inside

0 Dec 13, 2021
Exact Pareto Optimal solutions for preference based Multi-Objective Optimization

Exact Pareto Optimal solutions for preference based Multi-Objective Optimization

Debabrata Mahapatra 40 Dec 24, 2022
ICCV2021: Code for 'Spatial Uncertainty-Aware Semi-Supervised Crowd Counting'

ICCV2021: Code for 'Spatial Uncertainty-Aware Semi-Supervised Crowd Counting'

Yanda Meng 14 May 13, 2022
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification

DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification Created by Yongming Rao, Wenliang Zhao, Benlin Liu, Jiwen Lu, Jie Zhou, Ch

Yongming Rao 414 Jan 01, 2023
NPBG++: Accelerating Neural Point-Based Graphics

[CVPR 2022] NPBG++: Accelerating Neural Point-Based Graphics Project Page | Paper This repository contains the official Python implementation of the p

Ruslan Rakhimov 57 Dec 03, 2022
Official implementation of "Learning to Discover Cross-Domain Relations with Generative Adversarial Networks"

DiscoGAN Official PyTorch implementation of Learning to Discover Cross-Domain Relations with Generative Adversarial Networks. Prerequisites Python 2.7

SK T-Brain 754 Dec 29, 2022
DeepLab2: A TensorFlow Library for Deep Labeling

DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a unified and state-of-the-art TensorFlow codebase for dense pixel labeling tasks.

Google Research 845 Jan 04, 2023
Beyond imagenet attack (accepted by ICLR 2022) towards crafting adversarial examples for black-box domains.

Beyond ImageNet Attack: Towards Crafting Adversarial Examples for Black-box Domains (ICLR'2022) This is the Pytorch code for our paper Beyond ImageNet

Alibaba-AAIG 37 Nov 23, 2022
Polynomial-time Meta-Interpretive Learning

Louise - polynomial-time Program Learning Getting help with Louise Louise's author can be reached by email at Stassa Patsantzis 64 Dec 26, 2022