RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving

Last update: Nov 29, 2022

Overview

RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving (AAAI2021).

RTS3D is an efficient and accurate stereo 3D object detection method for autonomous driving.

Introduction

RTS3D is the first true real-time system (FPS > 24) for stereo-image 3D detection, and it improves average precision by 10% over the previous state-of-the-art method. RTS3D requires only RGB images; it does not need synthetic data, instance segmentation, CAD models, or a depth generator.

Highlights

Fast: 33 FPS single-image test speed on the KITTI benchmark at 384*1280 resolution.
Accurate: SOTA on the KITTI benchmark.
Anchor free: no 2D or 3D anchors are required.
Easy to deploy: RTS3D uses only conventional convolution operations and MLPs, so it is easy to deploy and accelerate.
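
The speed figures in this README are easier to compare as per-frame time budgets. The snippet below is only a convenience sketch: `frame_time_ms` is a hypothetical helper, not part of the RTS3D code base, and the FPS values are the ones quoted in this document (the real-time threshold, the headline speed, and the two FPS entries from the model zoo table).

```python
def frame_time_ms(fps):
    """Convert a frames-per-second figure into milliseconds per frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# FPS figures quoted in this README (labels are descriptive only).
quoted_speeds = [
    ("real-time threshold", 24),
    ("headline single-image speed", 33),
    ("model zoo, 1 iteration", 90.9),
    ("model zoo, 2 iterations", 45.5),
]

for label, fps in quoted_speeds:
    print(f"{label}: {fps} FPS = {frame_time_ms(fps):.1f} ms per frame")
```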

RTS3D Baseline and Model Zoo

All experiments were run on Ubuntu 16.04 with PyTorch 1.0.0, CUDA 9.0, Python 3.6, and a single NVIDIA 2080Ti.

IoU Setting 1: Car IoU > 0.5, Pedestrian IoU > 0.25, Cyclist IoU > 0.25

IoU Setting 2: Car IoU > 0.7, Pedestrian IoU > 0.5, Cyclist IoU > 0.5
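
The two IoU settings can be written down as a small lookup table. This is a minimal sketch for illustration: `IOU_THRESHOLDS` and `is_match` are hypothetical names, not identifiers from the RTS3D or kitti_eval code, and the real evaluation applies further rules (difficulty filtering, don't-care regions) that are not shown here.

```python
# Minimum IoU a detection must exceed to count as a match, per setting and class.
# Values are copied from the two IoU settings listed above.
IOU_THRESHOLDS = {
    1: {"Car": 0.5, "Pedestrian": 0.25, "Cyclist": 0.25},
    2: {"Car": 0.7, "Pedestrian": 0.5, "Cyclist": 0.5},
}


def is_match(class_name, iou, setting):
    """Return True if `iou` exceeds the threshold for this class and setting."""
    return iou > IOU_THRESHOLDS[setting][class_name]


# A car detection with IoU 0.6 counts under Setting 1 but not under Setting 2.
print(is_match("Car", 0.6, setting=1))
print(is_match("Car", 0.6, setting=2))
```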

Trained on the KITTI train split and evaluated on the val split.
- FCE Space Resolution: 10 * 10 * 10
- Model: (Google Drive), (Baidu Cloud, extraction code: k4uk)

| Class | Iteration | FPS | AP BEV, IoU Setting 1 (Easy / Moderate / Hard) | AP 3D, IoU Setting 1 (Easy / Moderate / Hard) | AP BEV, IoU Setting 2 (Easy / Moderate / Hard) | AP 3D, IoU Setting 2 (Easy / Moderate / Hard) |
|---|---|---|---|---|---|---|
| Car (Recall-11) | 1 | 90.9 | 89.83, 77.05, 68.28 | 89.27, 70.12, 61.17 | 73.20, 53.62, 46.44 | 60.87, 42.38, 36.44 |
| Car (Recall-40) | 1 | 90.9 | 92.92, 76.17, 66.62 | 90.35, 71.37, 63.52 | 78.12, 54.75, 47.09 | 60.34, 39.32, 32.97 |
| Car (Recall-11) | 2 | 45.5 | 90.41, 78.70, 70.03 | 90.26, 77.23, 68.28 | 76.56, 56.46, 48.20 | 63.65, 44.50, 37.48 |
| Car (Recall-40) | 2 | 45.5 | 95.75, 79.61, 69.69 | 93.57, 76.64, 66.72 | 78.12, 54.75, 47.09 | 63.99, 41.78, 34.96 |
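
Recall-11 and Recall-40 in the table refer to the number of recall positions at which interpolated precision is averaged: 11 positions (0, 0.1, ..., 1.0) or 40 positions (1/40, 2/40, ..., 1.0). The sketch below shows that averaging on a hand-made precision/recall curve. It is a simplified illustration, not the kitti_eval implementation; `interpolated_ap` and the toy curve are assumptions made for this example.

```python
def interpolated_ap(recalls, precisions, num_points):
    """Average interpolated precision over 11 or 40 recall positions.

    `recalls` and `precisions` are parallel lists describing points on a
    precision/recall curve. Returns AP as a percentage.
    """
    if num_points == 11:
        targets = [i / 10 for i in range(11)]      # 0.0, 0.1, ..., 1.0
    elif num_points == 40:
        targets = [i / 40 for i in range(1, 41)]   # 1/40, 2/40, ..., 1.0
    else:
        raise ValueError("num_points must be 11 or 40")

    total = 0.0
    for target in targets:
        # Interpolated precision: best precision at any recall >= target.
        reachable = [p for r, p in zip(recalls, precisions) if r >= target - 1e-9]
        total += max(reachable, default=0.0)
    return 100.0 * total / len(targets)


# Toy curve: precision 1.0 up to recall 0.5, then 0.5 up to recall 1.0.
toy_recalls = [0.5, 1.0]
toy_precisions = [1.0, 0.5]
print(f"Recall-11 AP: {interpolated_ap(toy_recalls, toy_precisions, 11):.2f}")
print(f"Recall-40 AP: {interpolated_ap(toy_recalls, toy_precisions, 40):.2f}")
```

The two protocols give different numbers for the same curve, which is why the table reports them on separate rows.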

Trained on the KITTI train split and evaluated on the val split.
- FCE Space Resolution: 10 * 10 * 10
- Recall split: 11
- Iteration: 2
- Model: (Google Drive), (Baidu Cloud, extraction code: 4t4u)

| Class | AP BEV, IoU Setting 1 (Easy / Moderate / Hard) | AP 3D, IoU Setting 1 (Easy / Moderate / Hard) | AP BEV, IoU Setting 2 (Easy / Moderate / Hard) | AP 3D, IoU Setting 2 (Easy / Moderate / Hard) |
|---|---|---|---|---|
| Car | 90.18, 78.46, 69.76 | 89.88, 76.64, 67.86 | 74.95, 54.07, 46.78 | 58.50, 39.74, 34.83 |
| Pedestrian | 57.12, 48.82, 40.88 | 56.36, 48.29, 40.22 | 32.16, 26.31, 21.28 | 26.95, 20.77, 19.74 |
| Cyclist | 54.48, 35.78, 30.80 | 53.86, 30.90, 30.52 | 33.59, 20.80, 20.14 | 31.05, 20.26, 18.93 |

Installation

Please refer to INSTALL.md

Dataset preparation

Please download the official KITTI 3D object detection dataset and organize the downloaded files as follows:

KM3DNet
├── kitti_format
│   ├── data
│   │   ├── kitti
│   │   │   ├── annotations
│   │   │   ├── calib /000000.txt .....
│   │   │   ├── image (left [0-7480], right [7481-14961], input augmentation)
│   │   │   ├── label /000000.txt .....
│   │   │   ├── train.txt val.txt trainval.txt
│   │   │   ├── mono_results /000000.txt .....
├── src
├── demo_kitti_format
├── readme
├── requirements.txt
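
A quick sanity check of this layout might look like the sketch below. It assumes, based only on the index ranges shown in the tree, that the right image for left index i is stored at index i + 7481. The helper names (`right_image_index`, `missing_entries`) and the idea of checking each entry by name are assumptions for illustration and are not taken from the RTS3D code.

```python
from pathlib import Path

# Left images occupy indices 0..7480, right images 7481..14961 (from the tree above).
NUM_LEFT_IMAGES = 7481

# Entries the tree above lists under kitti_format/data/kitti.
EXPECTED_ENTRIES = [
    "annotations",
    "calib",
    "image",
    "label",
    "train.txt",
    "val.txt",
    "trainval.txt",
    "mono_results",
]


def right_image_index(left_index):
    """Map a left-image index to the assumed index of its right-image partner."""
    if not 0 <= left_index < NUM_LEFT_IMAGES:
        raise ValueError(f"left index must be in [0, {NUM_LEFT_IMAGES - 1}]")
    return left_index + NUM_LEFT_IMAGES


def missing_entries(kitti_root):
    """Return the expected files or folders that are absent under `kitti_root`."""
    root = Path(kitti_root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    # The path below follows the tree above; adjust it to your checkout.
    missing = missing_entries("kitti_format/data/kitti")
    print("missing entries:", missing if missing else "none")
```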

Getting Started

Please refer to GETTING_STARTED.md to learn more about using this project.

Acknowledgement

License

RTS3D is released under the MIT License (refer to the LICENSE file for details). Portions of the code are borrowed from CenterNet, iou3d, and kitti_eval (KITTI dataset evaluation). Please refer to the original licenses of these projects (see NOTICE).

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@misc{2012.15072,
Author = {Peixuan Li and Shun Su and Huaici Zhao},
Title = {RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving},
Year = {2020},
Eprint = {arXiv:2012.15072},
}

