PyTorch Implementation of PIXOR: Real-time 3D Object Detection from Point Clouds

Overview

PIXOR: Real-time 3D Object Detection from Point Clouds

This is a custom implementation of the paper from Uber ATG using PyTorch 1.0. It represents the driving scene using lidar data in the Birds' Eye View (BEV) and uses a single stage object detector to predict the poses of road objects with respect to the car

PIXOR: Real-time 3D Object Detection from Point Clouds

alt text

Highlights

  • PyTorch 1.0 Reproduced and trained from scratch using the KITTI dataset
  • Fast Custom LiDAR preprocessing using C++
  • Multi-GPU Training and Pytorch MultiProcessing package to speed up non-maximum suppression during evaluation
  • Tensorboard Visualize trainig progress using Tensorboard
  • KITTI and ROSBAG Demo Scripts that supports running inferences directly on raw KITTI data or custom rosbags.

Install

Dependencies:

  • Python 3.5(3.6)
  • Pytorch (Follow Official Installation Guideline)
  • Tensorflow (see their website)
  • Numpy, MatplotLib, OpenCV3
  • PyKitti (for running on KITTI raw dataset)
  • gcc
pip install shapely numpy matplotlib
git clone https://github.com/philip-huang/PIXOR
cd PIXOR/srcs/preprocess
make

(Optional) If you want to run this project on a custom rosbag containing Velodyne HDL64 scans the system must be Linux with ROS kinetic installed. You also need to install the velodyne driver into the velodyne_ws folder.

Set up the velodyne workspace by running ./velodyne_setup.bash and press Ctrl-C as necessary.

Demo

A helper class is provided in run_kitti.py to simplify writing inference pipelines using pre-trained models. Here is how we would do it. Run this from the src folder (suppose I have already downloaded my KITTI raw data and extracted to somewhere)

from run_kitti import *

def make_kitti_video():
     
    basedir = '/mnt/ssd2/od/KITTI/raw'
    date = '2011_09_26'
    drive = '0035'
    dataset = pykitti.raw(basedir, date, drive)
   
    videoname = "detection_{}_{}.avi".format(date, drive)
    save_path = os.path.join(basedir, date, "{}_drive_{}_sync".format(date, drive), videoname)    
    run(dataset, save_path)

make_kitti_video()

Training and Evaluation

Our Training Result (as of Dec 2018) alt text

All configuration (hyperparameters, GPUs, etc) should be put in a config.json file and save to the directory srcs/experiments/$exp_name$ To train

python srcs/main.py train (--name=$exp_name$)

To evaluate an experiment

python srcs/main.py val (--name=$exp_name$)

To display a sample result

python srcs/main.py test --name=$exp_name$

To view tensorboard

tensorboard --logdir=srcs/logs/$exp_name$

TODO

  • Improve training accuracy on KITTI dataset
  • Data augmentation
  • Generalization gap on custom driving sequences
  • Data Collection
  • Improve model (possible idea: use map as a prior)

Credits

Project Contributors

  • Philip Huang
  • Allan Liu

Paper Citation below



@inproceedings{yang2018pixor,
  title={PIXOR: Real-Time 3D Object Detection From Point Clouds},
  author={Yang, Bin and Luo, Wenjie and Urtasun, Raquel}
}

We would like to thank aUToronto for genersouly sponsoring GPUs for this project

Owner
Philip Huang
University of Toronto | Engineering Science | Machine Intelligence
Philip Huang
Source code for CVPR 2021 paper "Riggable 3D Face Reconstruction via In-Network Optimization"

Riggable 3D Face Reconstruction via In-Network Optimization Source code for CVPR 2021 paper "Riggable 3D Face Reconstruction via In-Network Optimizati

130 Jan 02, 2023
Localized representation learning from Vision and Text (LoVT)

Localized Vision-Text Pre-Training Contrastive learning has proven effective for pre- training image models on unlabeled data and achieved great resul

Philip Müller 10 Dec 07, 2022
MPI Interest Group on Algorithms on 1st semester 2021

MPI Algorithms Interest Group Introduction Lecturer: Steve Yan Location: TBA Time Schedule: TBA Semester: 1 Useful URLs Typora: https://typora.io Goog

Ex10si0n 13 Sep 08, 2022
Official PyTorch code of DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization (ICCV 2021 Oral).

DeepPanoContext (DPC) [Project Page (with interactive results)][Paper] DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context G

Cheng Zhang 66 Nov 16, 2022
Open-L2O: A Comprehensive and Reproducible Benchmark for Learning to Optimize Algorithms

Open-L2O This repository establishes the first comprehensive benchmark efforts of existing learning to optimize (L2O) approaches on a number of proble

VITA 161 Jan 02, 2023
An Official Repo of CVPR '20 "MSeg: A Composite Dataset for Multi-Domain Segmentation"

This is the code for the paper: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation (CVPR 2020, Official Repo) [CVPR PDF] [Journal PDF] J

226 Nov 05, 2022
A Python 3 package for state-of-the-art statistical dimension reduction methods

direpack: a Python 3 library for state-of-the-art statistical dimension reduction techniques This package delivers a scikit-learn compatible Python 3

Sven Serneels 32 Dec 14, 2022
Machine learning for NeuroImaging in Python

nilearn Nilearn enables approachable and versatile analyses of brain volumes. It provides statistical and machine-learning tools, with instructive doc

919 Dec 25, 2022
Retinal vessel segmentation based on GT-UNet

Retinal vessel segmentation based on GT-UNet Introduction This project is a retinal blood vessel segmentation code based on UNet-like Group Transforme

Kent0n 27 Dec 18, 2022
GLaRA: Graph-based Labeling Rule Augmentation for Weakly Supervised Named Entity Recognition

GLaRA: Graph-based Labeling Rule Augmentation for Weakly Supervised Named Entity Recognition

Xinyan Zhao 29 Dec 26, 2022
[ICCV'21] UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction

UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction Project Page | Paper | Supplementary | Video This reposit

331 Dec 28, 2022
Official Pytorch Implementation of Unsupervised Image Denoising with Frequency Domain Knowledge

Unsupervised Image Denoising with Frequency Domain Knowledge (BMVC 2021 Oral) : Official Project Page This repository provides the official PyTorch im

Donggon Jang 12 Sep 26, 2022
Forecasting directional movements of stock prices for intraday trading using LSTM and random forest

Forecasting directional movements of stock-prices for intraday trading using LSTM and random-forest https://arxiv.org/abs/2004.10178 Pushpendu Ghosh,

Pushpendu Ghosh 270 Dec 24, 2022
The end-to-end platform for building voice products at scale

Picovoice Made in Vancouver, Canada by Picovoice Picovoice is the end-to-end platform for building voice products on your terms. Unlike Alexa and Goog

Picovoice 318 Jan 07, 2023
Open source implementation of "A Self-Supervised Descriptor for Image Copy Detection" (SSCD).

A Self-Supervised Descriptor for Image Copy Detection (SSCD) This is the open-source codebase for "A Self-Supervised Descriptor for Image Copy Detecti

Meta Research 68 Jan 04, 2023
Oriented Object Detection: Oriented RepPoints + Swin Transformer/ReResNet

Oriented RepPoints for Aerial Object Detection The code for the implementation of “Oriented RepPoints + Swin Transformer/ReResNet”. Introduction Based

96 Dec 13, 2022
CVPR 2020 oral paper: Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax.

Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax ⚠️ Latest: Current repo is a complete version. But we delet

FishYuLi 341 Dec 23, 2022
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR2021)

NExT-QA We reproduce some SOTA VideoQA methods to provide benchmark results for our NExT-QA dataset accepted to CVPR2021 (with 1 'Strong Accept' and 2

Junbin Xiao 50 Nov 24, 2022
Code for paper Adaptively Aligned Image Captioning via Adaptive Attention Time

Adaptively Aligned Image Captioning via Adaptive Attention Time This repository includes the implementation for Adaptively Aligned Image Captioning vi

Lun Huang 45 Aug 27, 2022
This repo generates the training data and the model for Morpheus-Deblend

Morpheus-Deblend This repo generates the training data and the model for Morpheus-Deblend. This is the active development repo for the project and as

Ryan Hausen 2 Apr 18, 2022