The source code of "SIDE: Center-based Stereo 3D Detector with Structure-aware Instance Depth Estimation", accepted to WACV 2022.

Last update: Dec 18, 2022

Related tags

Deep Learning SIDE

Overview

SIDE: Center-based Stereo 3D Detector with Structure-aware Instance Depth Estimation

The source code of our work "SIDE: Center-based Stereo 3D Detector with Structure-aware Instance Depth Estimation", accepted to WACV 2022

Installation

Requirements

Python >= 3.6
PyTorch >= 1.2
COCOAPI
DCNv2

Data Preparation

KITTI

Download the train-val split of 3DOP and SubCNN and place the data as below

  ${SIDE_ROOT}
  |-- data
  `-- |-- kitti
      `-- |-- training
          |   |-- image_2
          |   |-- label_2
          |   |-- calib
          |-- ImageSets_3dop
          |   |-- test.txt
          |   |-- train.txt
          |   |-- val.txt
          |   |-- trainval.txt
          `-- ImageSets_subcnn
              |-- test.txt
              |-- train.txt
              |-- val.txt
              |-- trainval.txt

Training

To train the kitti 3D object detection with dla on 4 GPUs, run

python testTrain.py stereo --exp_id sub_dla34 --dataset kitti --kitti_split subcnn --batch_size 16 --num_epochs 70 --lr_step 45,60 --gpus 0,1,2,3

By default, pytorch evenly splits the total batch size to each GPUs. --master_batch allows using different batchsize for the master GPU, which usually costs more memory than other GPUs. If it encounters GPU memory out, using slightly less batch size with the same learning is fine.

If the training is terminated before finishing, you can use the same commond with --resume to resume training. It will found the lastest model with the same exp_id.

Evaluation

To evaluate the kitti dataset, first compile the evaluation tool (from here):

cd SIDE_ROOT/src/tools/kitti_eval
g++ -o evaluate_object_3d_offline evaluate_object_3d_offline.cpp -O3

Then run the evaluation with pretrained model:

python testVal.py stereo --exp_id sub_dla34 --dataset kitti --kitti_split 3dop --resume

to evaluate the 3DOP split. For the subcnn split, change --kitti_split to subcnn and load the corresponding models.

License

SIDE itself is released under the MIT License (refer to the LICENSE file for details). Portions of the code are borrowed from CenterNet(anchor-free design), Stereo-RCNN(geometric constraint), DCNv2(deformable convolutions) and kitti_eval (KITTI dataset evaluation). Please refer to the original License of these projects (See NOTICE).

Reference

If you find our work useful in your research, please consider citing our paper:

@article{peng2021side,
  title={SIDE: Center-based Stereo 3D Detector with Structure-aware Instance Depth Estimation},
  author={Peng, Xidong and Zhu, Xinge and Wang, Tai and Ma, Yuexin},
  journal={arXiv preprint arXiv:2108.09663},
  year={2021}
}

The source code of "SIDE: Center-based Stereo 3D Detector with Structure-aware Instance Depth Estimation", accepted to WACV 2022.

Related tags

Overview

SIDE: Center-based Stereo 3D Detector with Structure-aware Instance Depth Estimation

Installation

Requirements

Data Preparation

KITTI

Training

Evaluation

License

Reference

Owner

Securetar - A streaming wrapper around python tarfile and allow secure handling files and support encryption

This is the research repository for Vid2Doppler: Synthesizing Doppler Radar Data from Videos for Training Privacy-Preserving Activity Recognition.

Exploring Relational Context for Multi-Task Dense Prediction [ICCV 2021]

PyTorch code of "SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks"

PyTorch implementation of Octave Convolution with pre-trained Oct-ResNet and Oct-MobileNet models

CoReNet is a technique for joint multi-object 3D reconstruction from a single RGB image.

Learning recognition/segmentation models without end-to-end training. 40%-60% less GPU memory footprint. Same training time. Better performance.

Leveraging Two Types of Global Graph for Sequential Fashion Recommendation, ICMR 2021

A program that uses computer vision to detect hand gestures, used for controlling movie players.

Worktory is a python library created with the single purpose of simplifying the inventory management of network automation scripts.

AI-Bot - 一个基于watermelon改造的OpenAI-GPT-2的智能机器人

A pre-trained language model for social media text in Spanish

MMGeneration is a powerful toolkit for generative models, based on PyTorch and MMCV.

Implementation of average- and worst-case robust flatness measures for adversarial training.

A toolkit for document-level event extraction, containing some SOTA model implementations

Code for the paper "Improving Vision-and-Language Navigation with Image-Text Pairs from the Web" (ECCV 2020)

Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning

Keras Implementation of The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation by (Simon Jégou, Michal Drozdzal, David Vazquez, Adriana Romero, Yoshua Bengio)

Experimental Python implementation of OpenVINO Inference Engine (very slow, limited functionality). All codes are written in Python. Easy to read and modify.

This repository contains code accompanying the paper "An End-to-End Chinese Text Normalization Model based on Rule-Guided Flat-Lattice Transformer"