Released code for Objects are Different: Flexible Monocular 3D Object Detection, CVPR21

Last update: Dec 06, 2022

Overview

MonoFlex

Work in progress.

Installation

This repo has been tested with Ubuntu 20.04, python==3.7, pytorch==1.4.0 and cuda==10.1. Create and activate a conda environment:

conda create -n monoflex python=3.7

conda activate monoflex

Install PyTorch and other dependencies:

conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch

pip install -r requirements.txt

Build DCNv2 and the project:

cd models/backbone/DCNv2

. make.sh

cd ../../..

python setup.py develop
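As a quick sanity check after installation, the following sketch compares the installed package versions with the ones listed above. The helper `check_versions` and the `EXPECTED` table are hypothetical and not part of this repository; the comparison simply reads each module's `__version__` attribute and ignores local build suffixes such as `+cu101`.

```python
import importlib

# Versions the README reports as tested; adjust if your setup differs.
EXPECTED = {"torch": "1.4.0", "torchvision": "0.5.0"}


def check_versions(expected):
    """Return {package: status}; status is 'ok', 'missing', or 'mismatch (found X)'."""
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        found = getattr(module, "__version__", "unknown")
        # Drop a local build suffix (e.g. '1.4.0+cu101' -> '1.4.0') before comparing.
        base = found.split("+")[0]
        if base == wanted:
            report[name] = "ok"
        else:
            report[name] = "mismatch (found {})".format(found)
    return report


if __name__ == "__main__":
    for package, status in check_versions(EXPECTED).items():
        print("{}: {}".format(package, status))
```

A mismatch does not necessarily mean the code will fail; it only indicates that the environment differs from the tested one.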

Data Preparation

Please download KITTI dataset and organize the data as follows:

#ROOT		
  |training/
    |calib/
    |image_2/
    |label/
    |ImageSets/
  |testing/
    |calib/
    |image_2/
    |ImageSets/

Then modify the paths in config/paths_catalog.py according to your data path.
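Before training, it may help to confirm that the dataset root matches the layout above. The sketch below is a hypothetical helper (not part of this repository) that lists the expected sub-directories missing under a given root. It only checks that the directories exist, not their contents.

```python
from pathlib import Path

# Sub-directories taken from the layout shown above.
EXPECTED_LAYOUT = {
    "training": ["calib", "image_2", "label", "ImageSets"],
    "testing": ["calib", "image_2", "ImageSets"],
}


def missing_dirs(root):
    """Return the expected 'split/subdir' entries that do not exist under root."""
    root = Path(root)
    missing = []
    for split, subdirs in EXPECTED_LAYOUT.items():
        for sub in subdirs:
            if not (root / split / sub).is_dir():
                missing.append("{}/{}".format(split, sub))
    return missing


if __name__ == "__main__":
    import sys

    # Usage: python check_layout.py /path/to/your/kitti/root
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    problems = missing_dirs(root)
    print("layout looks complete" if not problems else "missing: " + ", ".join(problems))
```

An empty result means every directory in the layout was found; the root you pass here should be the same one you set in config/paths_catalog.py.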

Training & Evaluation

Train with one GPU (TODO: multi-GPU training still needs further testing):

CUDA_VISIBLE_DEVICES=0 python tools/plain_train_net.py --batch_size 8 --config runs/monoflex.yaml --output output/exp

The model is evaluated periodically during training (the interval can be adjusted in the config). You can also evaluate a checkpoint with:

CUDA_VISIBLE_DEVICES=0 python tools/plain_train_net.py --config runs/monoflex.yaml --ckpt YOUR_CKPT --eval

You can also pass --vis during evaluation to visualize the predicted heatmap and 3D bounding boxes. The pretrained model for the train/val split and the logs are here.
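If you launch several training or evaluation jobs from a script, the commands above can be assembled programmatically. The following is a minimal sketch; `build_command` is a hypothetical helper, not part of this repository, and it uses only the flags that appear in the commands above.

```python
import shlex


def build_command(config, output=None, batch_size=None, ckpt=None,
                  evaluate=False, vis=False):
    """Assemble the argument list for tools/plain_train_net.py."""
    cmd = ["python", "tools/plain_train_net.py", "--config", config]
    if batch_size is not None:
        cmd += ["--batch_size", str(batch_size)]
    if output is not None:
        cmd += ["--output", output]
    if ckpt is not None:
        cmd += ["--ckpt", ckpt]
    if evaluate:
        cmd.append("--eval")
    if vis:
        cmd.append("--vis")
    return cmd


if __name__ == "__main__":
    # Print the evaluation command instead of running it.
    cmd = build_command("runs/monoflex.yaml", ckpt="YOUR_CKPT", evaluate=True, vis=True)
    print(" ".join(shlex.quote(part) for part in cmd))
```

To actually run a job, the list can be passed to `subprocess.run` with `CUDA_VISIBLE_DEVICES` set in the environment, for example `subprocess.run(cmd, env=dict(os.environ, CUDA_VISIBLE_DEVICES="0"), check=True)`.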

Note: we observe noticeable variation in performance across runs and are still investigating ways to stabilize the results, though the variation may be inevitable given the uncertainties the method uses.
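Given this run-to-run variation, it can be useful to report the spread over several runs rather than a single number. The sketch below assumes you have collected one metric value per run yourself. The function name is hypothetical, and the numbers in the usage lines are synthetic placeholders, not results from this repository.

```python
import statistics


def summarize_runs(scores):
    """Return (mean, sample standard deviation, min, max) of one metric over several runs."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    return (
        statistics.mean(scores),
        statistics.stdev(scores),  # sample standard deviation (n - 1 in the denominator)
        min(scores),
        max(scores),
    )


if __name__ == "__main__":
    # Synthetic placeholder values, for illustration only.
    mean, std, low, high = summarize_runs([10.0, 12.0, 14.0])
    print("mean={:.2f} std={:.2f} range=[{:.2f}, {:.2f}]".format(mean, std, low, high))
```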

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{MonoFlex,
    author    = {Zhang, Yunpeng and Lu, Jiwen and Zhou, Jie},
    title     = {Objects Are Different: Flexible Monocular 3D Object Detection},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {3289-3298}
}

Acknowledgment

The code borrows heavily from SMOKE; we thank the authors for their contribution.


Owner

Yunpeng
