RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving

Last update: Nov 29, 2022


Overview

RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving (AAAI2021).

RTS3D is an efficient and accurate stereo 3D object detection method for autonomous driving.


Introduction

RTS3D is the first true real-time system (FPS > 24) for stereo-image 3D detection, and it achieves a 10% improvement in average precision compared with the previous state-of-the-art method. RTS3D requires only RGB images, without synthetic data, instance segmentation, CAD models, or a depth generator.

Highlights

Fast: 33 FPS single-image test speed on the KITTI benchmark at 384 × 1280 resolution.
Accurate: SOTA on the KITTI benchmark.
Anchor free: no 2D or 3D anchors are required.
Easy to deploy: RTS3D uses conventional convolution operations and MLPs, so it is easy to deploy and accelerate.

RTS3D Baseline and Model Zoo

All experiments were run on Ubuntu 16.04 with PyTorch 1.0.0, CUDA 9.0, Python 3.6, and a single NVIDIA 2080Ti.

IoU Setting 1: Car IoU > 0.5, Pedestrian IoU > 0.25, Cyclist IoU > 0.25

IoU Setting 2: Car IoU > 0.7, Pedestrian IoU > 0.5, Cyclist IoU > 0.5
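The two settings above amount to a per-class lookup of the minimum IoU a detection must exceed to count as a match. A minimal sketch of that lookup follows; the names `IOU_THRESHOLDS` and `is_match` are hypothetical and are not taken from the RTS3D code base.

```python
# Minimum IoU per class for each evaluation setting listed above.
# Hypothetical helper for illustration; not part of the RTS3D repository.
IOU_THRESHOLDS = {
    1: {"Car": 0.5, "Pedestrian": 0.25, "Cyclist": 0.25},
    2: {"Car": 0.7, "Pedestrian": 0.5, "Cyclist": 0.5},
}


def is_match(class_name, iou, setting):
    """Return True if a detection's IoU exceeds the threshold for its class."""
    return iou > IOU_THRESHOLDS[setting][class_name]


# The same Car detection passes Setting 1 but fails the stricter Setting 2.
print(is_match("Car", 0.6, setting=1))
print(is_match("Car", 0.6, setting=2))
```

Setting 2 is the stricter of the two, which is why every AP column under Setting 2 is lower than its Setting 1 counterpart in the tables below.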

Training on KITTI train split and evaluation on val split.
- FCE Space Resolution: 10 * 10 * 10
- Model: (Google Drive), (Baidu Cloud, extraction code: k4uk)

Class (AP recall points)	Iteration	FPS	AP BEV, IoU Setting 1	AP 3D, IoU Setting 1	AP BEV, IoU Setting 2	AP 3D, IoU Setting 2
-	-	-	Easy / Moderate / Hard	Easy / Moderate / Hard	Easy / Moderate / Hard	Easy / Moderate / Hard
Car (Recall-11)	1	90.9	89.83, 77.05, 68.28	89.27, 70.12, 61.17	73.20, 53.62, 46.44	60.87, 42.38, 36.44
Car (Recall-40)	1	90.9	92.92, 76.17, 66.62	90.35, 71.37, 63.52	78.12, 54.75, 47.09	60.34, 39.32, 32.97
Car (Recall-11)	2	45.5	90.41, 78.70, 70.03	90.26, 77.23, 68.28	76.56, 56.46, 48.20	63.65, 44.50, 37.48
Car (Recall-40)	2	45.5	95.75, 79.61, 69.69	93.57, 76.64, 66.72	78.12, 54.75, 47.09	63.99, 41.78, 34.96
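The Recall-11 and Recall-40 rows report the same detections scored with a different number of recall sample points. The sketch below shows one common way to compute such an interpolated AP, assuming the usual KITTI-style definition (11 points at 0.0, 0.1, ..., 1.0; 40 points at 1/40, 2/40, ..., 1.0). The function `interpolated_ap` is hypothetical and is not the `kitti_eval` code used by this repository.

```python
def interpolated_ap(recalls, precisions, num_points):
    """Average, over sampled recall levels, of the best precision
    achieved at that recall level or any higher one.

    Assumed sampling (KITTI-style convention):
      num_points == 11 -> recall levels 0.0, 0.1, ..., 1.0
      num_points == 40 -> recall levels 1/40, 2/40, ..., 1.0
    """
    if num_points == 11:
        levels = [i / 10 for i in range(11)]
    elif num_points == 40:
        levels = [i / 40 for i in range(1, 41)]
    else:
        raise ValueError("num_points must be 11 or 40")

    total = 0.0
    for level in levels:
        # Precisions of all operating points whose recall reaches this level.
        reachable = [p for r, p in zip(recalls, precisions) if r >= level]
        total += max(reachable, default=0.0)
    return total / len(levels)


# Toy precision-recall curve with two operating points.
recalls = [0.5, 1.0]
precisions = [1.0, 0.5]
print(interpolated_ap(recalls, precisions, 11))
print(interpolated_ap(recalls, precisions, 40))
```

Because the 11-point variant includes the recall level 0.0 and the 40-point variant does not, the two scores generally differ for the same model, so they should only be compared like for like.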

Training on KITTI train split and evaluation on val split.
- FCE Space Resolution: 10 * 10 * 10
- AP recall points: 11 (Recall-11)
- Iteration: 2
- Model: (Google Drive), (Baidu Cloud, extraction code: 4t4u)

Class	AP BEV, IoU Setting 1	AP 3D, IoU Setting 1	AP BEV, IoU Setting 2	AP 3D, IoU Setting 2
-	Easy / Moderate / Hard	Easy / Moderate / Hard	Easy / Moderate / Hard	Easy / Moderate / Hard
Car	90.18, 78.46, 69.76	89.88, 76.64, 67.86	74.95, 54.07, 46.78	58.50, 39.74, 34.83
Pedestrian	57.12, 48.82, 40.88	56.36, 48.29, 40.22	32.16, 26.31, 21.28	26.95, 20.77, 19.74
Cyclist	54.48, 35.78, 30.80	53.86, 30.90, 30.52	33.59, 20.80, 20.14	31.05, 20.26, 18.93

Installation

Please refer to INSTALL.md

Dataset preparation

Please download the official KITTI 3D object detection dataset and organize the downloaded files as follows:

KM3DNet
├── kitti_format
│   ├── data
│   │   ├── kitti
│   │   │   ├── annotations
│   │   │   ├── calib /000000.txt .....
│   │   │   ├── image (left [0-7480], right [7481-14961], input augmentation)
│   │   │   ├── label /000000.txt .....
│   │   │   ├── train.txt val.txt trainval.txt
│   │   │   ├── mono_results /000000.txt .....
├── src
├── demo_kitti_format
├── readme
├── requirements.txt
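A quick way to sanity-check this layout before training is sketched below. It rests on two assumptions that the tree suggests but does not state outright: that every listed entry sits directly under `kitti_format/data/kitti`, and that the right image of a stereo pair is stored at the left image's index plus 7481. The helper names `right_image_index` and `missing_entries` are hypothetical.

```python
from pathlib import Path

# Left images use indices 0-7480, right images 7481-14961 (from the tree above).
NUM_LEFT_IMAGES = 7481

# Entries listed under kitti_format/data/kitti in the tree above.
EXPECTED_ENTRIES = [
    "annotations", "calib", "image", "label",
    "train.txt", "val.txt", "trainval.txt", "mono_results",
]


def right_image_index(left_index):
    """Map a left-image index to its right-image index.

    Assumption: the right image is stored at left_index + 7481.
    """
    if not 0 <= left_index < NUM_LEFT_IMAGES:
        raise ValueError(f"left index out of range: {left_index}")
    return left_index + NUM_LEFT_IMAGES


def missing_entries(kitti_root):
    """Return the expected files or folders that are absent under kitti_root."""
    root = Path(kitti_root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    print(right_image_index(0))
    print(missing_entries("kitti_format/data/kitti"))
```

An empty list from `missing_entries` means every entry named in the tree is present; it does not check the contents of the folders.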

Getting Started

Please refer to GETTING_STARTED.md to learn more about using this project.

Acknowledgement

License

RTS3D is released under the MIT License (refer to the LICENSE file for details). Portions of the code are borrowed from CenterNet, iou3d, and kitti_eval (KITTI dataset evaluation). Please refer to the original licenses of these projects (see NOTICE).

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@misc{2012.15072,
Author = {Peixuan Li and Shun Su and Huaici Zhao},
Title = {RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving},
Year = {2020},
Eprint = {arXiv:2012.15072},
}
