Pseudo lidar - (CVPR 2019) Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving

Overview

Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving

This paper has been accpeted by Conference on Computer Vision and Pattern Recognition (CVPR) 2019.

Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving

by Yan Wang, Wei-Lun Chao, Divyansh Garg, Bharath Hariharan, Mark Campbell and Kilian Q. Weinberger

Figure

Citation

@inproceedings{wang2019pseudo,
  title={Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving},
  author={Wang, Yan and Chao, Wei-Lun and Garg, Divyansh and Hariharan, Bharath and Campbell, Mark and Weinberger, Kilian},
  booktitle={CVPR},
  year={2019}
}

Update

  • 2nd July 2020: Add a jupyter script to visualize point cloud. It is in ./visualization folder.
  • 29th July 2019: submission.py will save the disparity to the numpy file, not png file. And fix the generate_lidar.py.
  • I have modifed the official avod a little bit. Now you can directly train and test pseudo-lidar with avod. Please check the code https://github.com/mileyan/avod_pl.

Contents

Introduction

3D object detection is an essential task in autonomous driving. Recent techniques excel with highly accurate detection rates, provided the 3D input data is obtained from precise but expensive LiDAR technology. Approaches based on cheaper monocular or stereo imagery data have, until now, resulted in drastically lower accuracies --- a gap that is commonly attributed to poor image-based depth estimation. However, in this paper we argue that data representation (rather than its quality) accounts for the majority of the difference. Taking the inner workings of convolutional neural networks into consideration, we propose to convert image-based depth maps to pseudo-LiDAR representations --- essentially mimicking LiDAR signal. With this representation we can apply different existing LiDAR-based detection algorithms. On the popular KITTI benchmark, our approach achieves impressive improvements over the existing state-of-the-art in image-based performance --- raising the detection accuracy of objects within 30m range from the previous state-of-the-art of 22% to an unprecedented 74%. At the time of submission our algorithm holds the highest entry on the KITTI 3D object detection leaderboard for stereo image based approaches.

Usage

1. Overview

We provide the guidance and codes to train stereo depth estimator and 3D object detector using the KITTI object detection benchmark. We also provide our pre-trained models.

2. Stereo depth estimation models

We provide our pretrained PSMNet model using the Scene Flow dataset and the 3,712 training images of the KITTI detection benchmark.

We also directly provide the pseudo-LiDAR point clouds and the ground planes of training and testing images estimated by this pre-trained model.

We also provide codes to train your own stereo depth estimator and prepare the point clouds and gound planes. If you want to use our pseudo-LiDAR data for 3D object detection, you may skip the following contents and directly move on to object detection models.

2.1 Dependencies

  • Python 3.5+
  • numpy, scikit-learn, scipy
  • KITTI 3D object detection dataset

2.2 Download the dataset

You need to download the KITTI dataset from here, including left and right color images, Velodyne point clouds, camera calibration matrices, and training labels. You also need to download the image set files from here. Then you need to organize the data in the following way.

KITTI/object/
    
    train.txt
    val.txt
    test.txt 
    
    training/
        calib/
        image_2/ #left image
        image_3/ #right image
        label_2/
        velodyne/ 

    testing/
        calib/
        image_2/
        image_3/
        velodyne/

The Velodyne point clouds (by LiDAR) are used ONLY as the ground truths to train a stereo depth estimator (e.g., PSMNet).

2.3 Generate ground-truth image disparities

Use the script./preprocessing/generate_disp.py to process all velodyne files appeared in train.txt. This is our training ground truth. Or you can directly download them from disparity. Name this folder as disparity and put it inside the training folder.

python generate_disp.py --data_path ./KITTI/object/training/ --split_file ./KITTI/object/train.txt 

2.4. Train the stereo model

You can train any stereo disparity model as you want. Here we give an example to train the PSMNet. The modified code is saved in the subfolder psmnet. Make sure you follow the README inside this folder to install the correct python and library. I strongly suggest using conda env to organize the python environments since we will use Python with different versions. Download the psmnet model pretrained on Sceneflow dataset from here.

# train psmnet with 4 TITAN X GPUs.
python ./psmnet/finetune_3d.py --maxdisp 192 \
     --model stackhourglass \
     --datapath ./KITTI/object/training/ \
     --split_file ./KITTI/object/train.txt \
     --epochs 300 \
     --lr_scale 50 \
     --loadmodel ./pretrained_sceneflow.tar \
     --savemodel ./psmnet/kitti_3d/  --btrain 12

2.5 Predict the point clouds

Predict the disparities.
# training
python ./psmnet/submission.py \
    --loadmodel ./psmnet/kitti_3d/finetune_300.tar \
    --datapath ./KITTI/object/training/ \
    --save_path ./KITTI/object/training/predict_disparity
# testing
python ./psmnet/submission.py \
    --loadmodel ./psmnet/kitti_3d/finetune_300.tar \
    --datapath ./KITTI/object/testing/ \
    --save_path ./KITTI/object/testing/predict_disparity
Convert the disparities to point clouds.
# training
python ./preprocessing/generate_lidar.py  \
    --calib_dir ./KITTI/object/training/calib/ \
    --save_dir ./KITTI/object/training/pseudo-lidar_velodyne/ \
    --disparity_dir ./KITTI/object/training/predict_disparity \
    --max_high 1
# testing
python ./preprocessing/generate_lidar.py  \
    --calib_dir ./KITTI/object/testing/calib/ \
    --save_dir ./KITTI/object/testing/pseudo-lidar_velodyne/ \
    --disparity_dir ./KITTI/object/testing/predict_disparity \
    --max_high 1

If you want to generate point cloud from depth map (like DORN), you can add --is_depth in the command.

2.6 Generate ground plane

If you want to train an AVOD model for 3D object detection, you need to generate ground planes from pseudo-lidar point clouds.

#training
python ./preprocessing/kitti_process_RANSAC.py \
    --calib ./KITTI/object/training/calib/ \
    --lidar_dir  ./KITTI/object/training/pseudo-lidar_velodyne/ \
    --planes_dir /KITTI/object/training/pseudo-lidar_planes/
#testing
python ./preprocessing/kitti_process_RANSAC.py \
    --calib ./KITTI/object/testing/calib/ \
    --lidar_dir  ./KITTI/object/testing/pseudo-lidar_velodyne/ \
    --planes_dir /KITTI/object/testing/pseudo-lidar_planes/

3. Object Detection models

AVOD model

Download the code from https://github.com/kujason/avod and install the Python dependencies.

Follow their README to prepare the data and then replace (1) files in velodyne with those in pseudo-lidar_velodyne and (2) files in planes with those in pseudo-lidar_planes. Note that you should still keep the folder names as velodyne and planes.

Follow their README to train the pyramid_cars_with_aug_example model. You can also download our pretrained model and directly evaluate on it. But if you want to submit your result to the leaderboard, you need to train it on trainval.txt.

Frustum-PointNets model

Download the code from https://github.com/charlesq34/frustum-pointnets and install the Python dependencies.

Follow their README to prepare the data and then replace files in velodyne with those in pseudo-lidar_velodyne. Note that you should still keep the folder name as velodyne.

Follow their README to train the v1 model. You can also download our pretrained model and directly evaluate on it.

Results

The main results on the validation dataset of our pseudo-LiDAR method. Figure

You can download the avod validation results from HERE.

Contact

If you have any question, please feel free to email us.

Yan Wang ([email protected]), Harry Chao([email protected]), Div Garg([email protected])

Towards End-to-end Video-based Eye Tracking

Towards End-to-end Video-based Eye Tracking The code accompanying our ECCV 2020 publication and dataset, EVE. Authors: Seonwook Park, Emre Aksan, Xuco

Seonwook Park 76 Dec 12, 2022
sequitur is a library that lets you create and train an autoencoder for sequential data in just two lines of code

sequitur sequitur is a library that lets you create and train an autoencoder for sequential data in just two lines of code. It implements three differ

Jonathan Shobrook 305 Dec 21, 2022
Official Repo for ICCV2021 Paper: Learning to Regress Bodies from Images using Differentiable Semantic Rendering

[ICCV2021] Learning to Regress Bodies from Images using Differentiable Semantic Rendering Getting Started DSR has been implemented and tested on Ubunt

Sai Kumar Dwivedi 83 Nov 27, 2022
Official Pytorch implementation of the paper "MotionCLIP: Exposing Human Motion Generation to CLIP Space"

MotionCLIP Official Pytorch implementation of the paper "MotionCLIP: Exposing Human Motion Generation to CLIP Space". Please visit our webpage for mor

Guy Tevet 173 Dec 26, 2022
Python PID Tuner - Makes a model of the System from a Process Reaction Curve and calculates PID Gains

PythonPID_Tuner_SOPDT Step 1: Takes a Process Reaction Curve in csv format - assumes data at 100ms interval (column names CV and PV) Step 2: Makes a r

1 Jan 18, 2022
A unet implementation for Image semantic segmentation

Unet-pytorch a unet implementation for Image semantic segmentation 参考网上的Unet做分割的代码,做了一个针对kaggle地盐识别的,请去以下地址获取数据集: https://www.kaggle.com/c/tgs-salt-id

Rabbit 3 Jun 29, 2022
Bare bones use-case for deploying a containerized web app (built in streamlit) on AWS.

Containerized Streamlit web app This repository is featured in a 3-part series on Deploying web apps with Streamlit, Docker, and AWS. Checkout the blo

Collin Prather 62 Jan 02, 2023
Definition of a business problem according to Wilson Lower Bound Score and Time Based Average Rating

Wilson Lower Bound Score, Time Based Rating Average In this study I tried to calculate the product rating and sorting reviews more accurately. I have

3 Sep 30, 2021
potpourri3d - An invigorating blend of 3D geometry tools in Python.

A Python library of various algorithms and utilities for 3D triangle meshes and point clouds. Managed by Nicholas Sharp, with new tools added lazily as needed. Currently, mainly bindings to C++ tools

Nicholas Sharp 295 Jan 05, 2023
Official PyTorch implementation of the paper "Deep Constrained Least Squares for Blind Image Super-Resolution", CVPR 2022.

Deep Constrained Least Squares for Blind Image Super-Resolution [Paper] This is the official implementation of 'Deep Constrained Least Squares for Bli

MEGVII Research 141 Dec 30, 2022
The official implementation code of "PlantStereo: A Stereo Matching Benchmark for Plant Surface Dense Reconstruction."

PlantStereo This is the official implementation code for the paper "PlantStereo: A Stereo Matching Benchmark for Plant Surface Dense Reconstruction".

Wang Qingyu 14 Nov 28, 2022
Deep Learning for Time Series Forecasting.

nixtlats:Deep Learning for Time Series Forecasting [nikstla] (noun, nahuatl) Period of time. State-of-the-art time series forecasting for pytorch. Nix

Nixtla 5 Dec 06, 2022
Face Identity Disentanglement via Latent Space Mapping [SIGGRAPH ASIA 2020]

Face Identity Disentanglement via Latent Space Mapping Description Official Implementation of the paper Face Identity Disentanglement via Latent Space

150 Dec 07, 2022
PatrickStar enables Larger, Faster, Greener Pretrained Models for NLP. Democratize AI for everyone.

PatrickStar: Parallel Training of Large Language Models via a Chunk-based Memory Management Meeting PatrickStar Pre-Trained Models (PTM) are becoming

Tencent 633 Dec 28, 2022
Chainer implementation of recent GAN variants

Chainer-GAN-lib This repository collects chainer implementation of state-of-the-art GAN algorithms. These codes are evaluated with the inception score

399 Oct 23, 2022
Multi-Modal Machine Learning toolkit based on PyTorch.

简体中文 | English TorchMM 简介 多模态学习工具包 TorchMM 旨在于提供模态联合学习和跨模态学习算法模型库,为处理图片文本等多模态数据提供高效的解决方案,助力多模态学习应用落地。 近期更新 2022.1.5 发布 TorchMM 初始版本 v1.0 特性 丰富的任务场景:工具

njustkmg 1 Jan 05, 2022
Code for the ICME 2021 paper "Exploring Driving-Aware Salient Object Detection via Knowledge Transfer"

TSOD Code for the ICME 2021 paper "Exploring Driving-Aware Salient Object Detection via Knowledge Transfer" Usage For training, open train_test, run p

Jinming Su 2 Dec 23, 2021
A C implementation for creating 2D voronoi diagrams

Branch OSX/Linux Windows master dev jc_voronoi A fast C/C++ header only implementation for creating 2D Voronoi diagrams from a point set Uses Fortune'

Mathias Westerdahl 481 Dec 29, 2022
This program automatically runs Python code copied in clipboard

CopyRun This program runs Python code which is copied in clipboard WARNING!! USE AT YOUR OWN RISK! NO GUARANTIES IF ANYTHING GETS BROKEN. DO NOT COPY

vertinski 4 Sep 10, 2021
Codes to calculate solar-sensor zenith and azimuth angles directly from hyperspectral images collected by UAV. Works only for UAVs that have high resolution GNSS/IMU unit.

UAV Solar-Sensor Angle Calculation Table of Contents About The Project Built With Getting Started Prerequisites Installation Datasets Contributing Lic

Sourav Bhadra 1 Jan 15, 2022