(NeurIPS 2020) Wasserstein Distances for Stereo Disparity Estimation

Last update: Dec 12, 2022

Related tags

Deep Learning W-Stereo-Disp

Overview

Wasserstein Distances for Stereo Disparity Estimation

Accepted in NeurIPS 2020 as Spotlight. [Project Page]

by Divyansh Garg, Yan Wang, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger and Wei-Lun Chao

Citation

@inproceedings{div2020wstereo,
  title={Wasserstein Distances for Stereo Disparity Estimation},
  author={Garg, Divyansh and Wang, Yan and Hariharan, Bharath and Campbell, Mark and Weinberger, Kilian and Chao, Wei-Lun},
  booktitle={NeurIPS},
  year={2020}
}

Introduction

Existing approaches to depth or disparity estimation output a distribution over a set of pre-defined discrete values. This leads to inaccurate results when the true depth or disparity does not match any of these values. The fact that this distribution is usually learned indirectly through a regression loss causes further problems in ambiguous regions around object boundaries. We address these issues using a new neural network architecture that is capable of outputting arbitrary depth values, and a new loss function that is derived from the Wasserstein distance between the true and the predicted distributions. We validate our approach on a variety of tasks, including stereo disparity and depth estimation, and the downstream 3D object detection. Our approach drastically reduces the error in ambiguous regions, especially around object boundaries that greatly affect the localization of objects in 3D, achieving the state-of-the-art in 3D object detection for autonomous driving.

Our Wasserstein loss modification W_loss can be easily plugged in existing stereo depth models to improve the training and obtain better results.

We release the code for CDN-PSMNet and CDN-SDN models.

Requirements
Pretrained Models
Training and Inference

Requirements

Python 3.7
Pytorch 1.2.0+
CUDA
pip install -r ./requirements.txt
SceneFlow
KITTI

Pretrained Models

TO BE ADDED.

Datasets

You have to download the SceneFlow and KITTI datasets. The structures of the datasets are shown in below.

SceneFlow Dataset Structure

SceneFlow
    | monkaa
        | frames_cleanpass
        | disparity
    | driving
        | frames_cleanpass
        | disparity
    | flyingthings3d
        | frames_cleanpass 
        | disparity

KITTI Object Detection Dataset Structure

KITTI
    | training
        | calib
        | image_2
        | image_3
        | velodyne
    | testing
        | calib
        | image_2
        | image_3

Generate soft-links of SceneFlow Datasets. The results will be saved in ./sceneflow folder. Please change to fakepath path-to-SceneFlow to the SceneFlow dataset location before running the script.

python sceneflow.py --path path-to-SceneFlow --force

Convert the KITTI velodyne ground truths to depth maps. Please change to fakepath path-to-KITTI to the SceneFlow dataset location before running the script.

python ./src/preprocess/generate_depth_map.py --data_path path-to-KITTI/ --split_file ./split/trainval.txt

Optionally download KITTI2015 datasets for evaluating stereo disparity models.

Training and Inference

1 Train CDN from Scratch on SceneFlow Dataset
2 Train CDN on KITTI Dataset
3 Generate Predictions
4 Train 3D Detection with Pseudo-LiDAR

We have provided all pretrained models Pretrained Models. If you only want to generate the predictions, you can directly go to step 3.

The default setting requires four gpus to train. You can use smaller batch sizes which are btrain and bval, if you don't have enough gpus.

We provide code for both stereo disparity and stereo depth models.

1 Train CDN-SDN from Scratch on SceneFlow Dataset

python ./src/main_depth.py -c src/configs/sceneflow_w1.config

The checkpoints are saved in ./results/stack_sceneflow_w1/.

Follow same procedure to train stereo disparity model, but use src/main_disp.py and change to a disparity config.

2 Train CDN-SDN on KITTI Dataset

python ./src/main_depth.py -c src/configs/kitti_w1.config \
    --pretrain ./results/sceneflow_w1/checkpoint.pth.tar --dataset  path-to-KITTI/training/

Before running, please change the fakepath path-to-KITTI/ to the correct one. --pretrain is the path to the pretrained model on SceneFlow. The training results are saved in ./results/kitti_w1_train.

If you are working on evaluating CDN on KITTI testing set, you might want to train CDN on training+validation sets. The training results will be saved in ./results/sdn_kitti_trainval.

python ./src/main_depth.py -c src/configs/kitti_w1.config \
    --pretrain ./results/sceneflow_w1/checkpoint.pth.tar \
    --dataset  path-to-KITTI/training/ --split_train ./split/trainval.txt \
    --save_path ./results/sdn_kitti_trainval

The disparity models can also be trained on KITTI2015 datasets using src/kitti2015_w1_disp.config.

3 Generate Predictions

Please change the fakepath path-to-KITTI. Moreover, if you use the our provided checkpoint, please modify the value of --resume to the checkpoint location.

a. Using the model trained on KITTI training set, and generating predictions on training + validation sets.

python ./src/main_depth.py -c src/configs/kitti_w1.config \
    --resume ./results/sdn_kitti_train/checkpoint.pth.tar --datapath  path-to-KITTI/training/ \
    --data_list ./split/trainval.txt --generate_depth_map --data_tag trainval

The results will be saved in ./results/sdn_kitti_train/depth_maps_trainval/.

b. Using the model trained on KITTI training + validation set, and generating predictions on testing sets. You will use them when you want to submit your results to the leaderboard.

The results will be saved in ./results/sdn_kitti_trainval_set/depth_maps_trainval/.

# testing sets
python ./src/main_depth.py -c src/configs/kitti_w1.config \
    --resume ./results/sdn_kitti_trainval/checkpoint.pth.tar --datapath  path-to-KITTI/testing/ \
    --data_list=./split/test.txt --generate_depth_map --data_tag test

The results will be saved in ./results/sdn_kitti_trainval/depth_maps_test/.

4 Train 3D Detection with Pseudo-LiDAR

For training 3D object detection models, follow step 4 and after in the Pseudo-LiDAR_V2 repo https://github.com/mileyan/Pseudo_Lidar_V2.

Results

Results on the Stereo Disparity

3D Object Detection Results on KITTI leader board

Questions

Please feel free to email us if you have any questions.

Divyansh Garg [email protected] Yan Wang [email protected] Wei-Lun Chao [email protected]

(NeurIPS 2020) Wasserstein Distances for Stereo Disparity Estimation

Related tags

Overview

Wasserstein Distances for Stereo Disparity Estimation

Citation

Introduction

Contents

Requirements

Pretrained Models

Datasets

SceneFlow Dataset Structure

KITTI Object Detection Dataset Structure

Training and Inference

1 Train CDN-SDN from Scratch on SceneFlow Dataset

2 Train CDN-SDN on KITTI Dataset

3 Generate Predictions

4 Train 3D Detection with Pseudo-LiDAR

Results

Results on the Stereo Disparity

3D Object Detection Results on KITTI leader board

Questions

Owner

Divyansh Garg

Codes for AAAI 2022 paper: Context-aware Health Event Prediction via Transition Functions on Dynamic Disease Graphs

PyTorch implementation of our paper: Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition

Rafael Project- Classifying rockets to different types using data science algorithms.

Pytorch ImageNet1k Loader with Bounding Boxes.

Pynomial - a lightweight python library for implementing the many confidence intervals for the risk parameter of a binomial model

Representing Long-Range Context for Graph Neural Networks with Global Attention

Pytorch Implementation of Adversarial Deep Network Embedding for Cross-Network Node Classification

TAug :: Time Series Data Augmentation using Deep Generative Models

A PyTorch Implementation of FaceBoxes

This is an easy python software which allows to sort images with faces by gender and after by age.

competitions-v2

CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP

Semi-supervised Adversarial Learning to Generate Photorealistic Face Images of New Identities from 3D Morphable Model

We will see a basic program that is basically a hint to brute force attack to crack passwords. In other words, we will make a program to Crack Any Password Using Python. Show some ❤️ by starring this repository!

Code samples for my book "Neural Networks and Deep Learning"

This is a simple framework to make object detection dataset very quickly

Ipython notebook presentations for getting starting with basic programming, statistics and machine learning techniques

Simple Tensorflow implementation of "Adaptive Convolutions for Structure-Aware Style Transfer" (CVPR 2021)

Scene-Text-Detection-and-Recognition (Pytorch)

U-Net Brain Tumor Segmentation