FADNet++: Real-Time and Accurate Disparity Estimation with Configurable Networks

Overview

FADNet++: Real-Time and Accurate Disparity Estimation with Configurable Networks

This repository contains the code (in PyTorch) for the "FADNet++" paper.

Contents

  1. Introduction
  2. Usage
  3. Results
  4. Acknowledgement
  5. Contacts

Introduction

We propose an efficient and accurate deep network for disparity estimation named FADNet with three main features:

  • It exploits efficient 2D based correlation layers with stacked blocks to preserve fast computation.
  • It combines the residual structures to make the deeper model easier to learn.
  • It contains multi-scale predictions so as to exploit a multi-scale weight scheduling training technique to improve the accuracy.

Usage

Dependencies

Package Installation

  • Execute "sh compile.sh" to compile libraries needed by GANet.
  • Enter "layers_package" and execute "sh install.sh" to install customized layers, including Channel Normalization layer and Resample layer.

We also release the docker version of this project, which has been configured completely and can be used directly. Please refer to this website for the image.

Usage of Scene Flow dataset
Download RGB cleanpass images and its disparity for three subset: FlyingThings3D, Driving, and Monkaa. Organize them as follows:
- FlyingThings3D_release/frames_cleanpass
- FlyingThings3D_release/disparity
- driving_release/frames_cleanpass
- driving_release/disparity
- monkaa_release/frames_cleanpass
- monkaa_release/disparity
Put them in the data/ folder (or soft link). The *train.sh* defaultly locates the data root path as data/.

Train

We use template scripts to configure the training task, which are stored in exp_configs. One sample "fadnet.conf" is as follows:

net=fadnet
loss=loss_configs/fadnet_sceneflow.json
outf_model=models/${net}-sceneflow
logf=logs/${net}-sceneflow.log

lr=1e-4
devices=0,1,2,3
dataset=sceneflow
trainlist=lists/SceneFlow.list
vallist=lists/FlyingThings3D_release_TEST.list
startR=0
startE=0
batchSize=16
maxdisp=-1
model=none
#model=fadnet_sceneflow.pth
Parameter Description Options
net network architecture name dispnets, dispnetc, dispnetcss, fadnet, psmnet, ganet
loss loss weight scheduling configuration file depends on the training scheme
outf_model folder name to store the model files \
logf log file name \
lr initial learning rate \
devices GPU device IDs to use depends on the hardware system
dataset dataset name to train sceneflow
train(val)list sample lists for training/validation \
startR the round index to start training (for restarting training from the checkpoint) \
startE the epoch index to start training (for restarting training from the checkpoint) \
batchSize the number of samples per batch \
maxdisp the maximum disparity that the model tries to predict \
model the model file path of the checkpoint \

We have integrated PSMNet and GANet for comparison. The sample configuration files are also given.

To start training, use the following command, dnn=CONFIG_FILE sh train.sh, such as:

dnn=fadnet sh train.sh

You do not need the suffix for CONFIG_FILE.

Evaluation

We have two modes for performance evaluation, test and detect, respectively. test requires that the testing samples should have ground truth of disparity and then reports the average End-point-error (EPE). detect does not require any ground truth for EPE computation. However, detect stores the disparity maps for each sample in the given list.

For the test mode, one can revise test.sh and run sh test.sh. The contents of test.sh are as follows:

net=fadnet
maxdisp=-1
dataset=sceneflow
trainlist=lists/SceneFlow.list
vallist=lists/FlyingThings3D_release_TEST.list

loss=loss_configs/test.json
outf_model=models/test/
logf=logs/${net}_test_on_${dataset}.log

lr=1e-4
devices=0,1,2,3
startR=0
startE=0
batchSize=8
model=models/fadnet.pth
python main.py --cuda --net $net --loss $loss --lr $lr \
               --outf $outf_model --logFile $logf \
               --devices $devices --batch_size $batchSize \
               --trainlist $trainlist --vallist $vallist \
               --dataset $dataset --maxdisp $maxdisp \
               --startRound $startR --startEpoch $startE \
               --model $model 

Most of the parameters in test.sh are similar to training. However, you can just ignore parameters, including trainlist, loss, outf_model, since they are not used in the test mode.

For the detect mode, one can revise detect.sh and run sh detect.sh. The contents of detect.sh are as follows:

net=fadnet
dataset=sceneflow

model=models/fadnet.pth
outf=detect_results/${net}-${dataset}/

filelist=lists/FlyingThings3D_release_TEST.list
filepath=data

CUDA_VISIBLE_DEVICES=0 python detecter.py --model $model --rp $outf --filelist $filelist --filepath $filepath --devices 0 --net ${net} 

You can revise the value of outf to change the folder that stores the predicted disparity maps.

Finetuning on KITTI datasets and result submission

We re-use the codes in PSMNet to finetune the pretrained models on KITTI datasets and generate disparity maps for submission. Use finetune.sh and submission.sh to do them respectively.

Pretrained Model

Update: 2020/2/6 We released the pre-trained Scene Flow model.

KITTI 2015 Scene Flow KITTI 2012
/ Google Drive /

Results

Results on Scene Flow dataset

Model EPE GPU Memory during inference (GB) Runtime (ms) on Tesla V100
FADNet 0.83 3.87 48.1
DispNetC 1.68 1.62 18.7
PSMNet 1.09 13.99 399.3
GANet 0.84 29.1 2251.1

Citation

If you find the code and paper is useful in your work, please cite our conference paper

@inproceedings{wang2020fadnet,
  title={{FADNet}: A Fast and Accurate Network for Disparity Estimation},
  author={Wang, Qiang and Shi, Shaohuai and Zheng, Shizhen and Zhao, Kaiyong and Chu, Xiaowen},
  booktitle={2020 {IEEE} International Conference on Robotics and Automation ({ICRA} 2020)},
  pages={101--107},
  year={2020}
}

Acknowledgement

We acknowledge the following repositories and papers since our project has used some codes of them.

Contacts

[email protected]

Any discussions or concerns are welcomed!

Owner
HKBU High Performance Machine Learning Lab
HKBU High Performance Machine Learning Lab
object recognition with machine learning on Respberry pi

Respberrypi_object-recognition object recognition with machine learning on Respberry pi line.py 建立一支與樹梅派連線的 linebot 使用此 linebot 遠端控制樹梅派拍照 config.ini l

1 Dec 11, 2021
Explainability of the Implications of Supervised and Unsupervised Face Image Quality Estimations Through Activation Map Variation Analyses in Face Recognition Models

Explainable_FIQA_WITH_AMVA Note This is the official repository of the paper: Explainability of the Implications of Supervised and Unsupervised Face I

3 May 08, 2022
PyTorch version implementation of DORN

DORN_PyTorch This is a PyTorch version implementation of DORN Reference H. Fu, M. Gong, C. Wang, K. Batmanghelich and D. Tao: Deep Ordinal Regression

Zilin.Zhang 3 Apr 27, 2022
Python code for the paper How to scale hyperparameters for quickshift image segmentation

How to scale hyperparameters for quickshift image segmentation Python code for the paper How to scale hyperparameters for quickshift image segmentatio

0 Jan 25, 2022
MetaTTE: a Meta-Learning Based Travel Time Estimation Model for Multi-city Scenarios

MetaTTE: a Meta-Learning Based Travel Time Estimation Model for Multi-city Scenarios This is the official TensorFlow implementation of MetaTTE in the

morningstarwang 4 Dec 14, 2022
[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers

TubeDETR: Spatio-Temporal Video Grounding with Transformers Website • STVG Demo • Paper This repository provides the code for our paper. This includes

Antoine Yang 108 Dec 27, 2022
Exploring Visual Engagement Signals for Representation Learning

Exploring Visual Engagement Signals for Representation Learning Menglin Jia, Zuxuan Wu, Austin Reiter, Claire Cardie, Serge Belongie and Ser-Nam Lim C

Menglin Jia 9 Jul 23, 2022
Minimalistic PyTorch training loop

Backbone for PyTorch training loop Will try to keep it minimalistic. pip install back from back import Bone Features Progress bar Checkpoints saving/l

Kashin 4 Jan 16, 2020
A curated list of neural rendering resources.

Awesome-of-Neural-Rendering A curated list of neural rendering and related resources. Please feel free to pull requests or open an issue to add papers

Zhiwei ZHANG 43 Dec 09, 2022
Robot Hacking Manual (RHM). From robotics to cybersecurity. Papers, notes and writeups from a journey into robot cybersecurity.

RHM: Robot Hacking Manual Download in PDF RHM v0.4 ┃ Read online The Robot Hacking Manual (RHM) is an introductory series about cybersecurity for robo

Víctor Mayoral Vilches 233 Dec 30, 2022
This is the official source code of "BiCAT: Bi-Chronological Augmentation of Transformer for Sequential Recommendation".

BiCAT This is our TensorFlow implementation for the paper: "BiCAT: Sequential Recommendation with Bidirectional Chronological Augmentation of Transfor

John 15 Dec 06, 2022
PyTorch implementation of ''Background Activation Suppression for Weakly Supervised Object Localization''.

Background Activation Suppression for Weakly Supervised Object Localization PyTorch implementation of ''Background Activation Suppression for Weakly S

35 Jan 06, 2023
Calling Julia from Python - an experiment on data loading

Calling Julia from Python - an experiment on data loading See the slides. TLDR After reading Patrick's blog post, we decided to try to replace C++ wit

Abel Siqueira 8 Jun 07, 2022
Active learning for Mask R-CNN in Detectron2

MaskAL - Active learning for Mask R-CNN in Detectron2 Summary MaskAL is an active learning framework that automatically selects the most-informative i

49 Dec 20, 2022
2021-AIAC-QQ-Browser-Hyperparameter-Optimization-Rank6

2021-AIAC-QQ-Browser-Hyperparameter-Optimization-Rank6

Aigege 8 Mar 31, 2022
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge

Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge This is an implementation of the paper,

Mutian He 19 Oct 14, 2022
Deep Halftoning with Reversible Binary Pattern

Deep Halftoning with Reversible Binary Pattern ICCV Paper | Project Website | BibTex Overview Existing halftoning algorithms usually drop colors and f

Menghan Xia 17 Nov 22, 2022
Multiple-Object Tracking with Transformer

TransTrack: Multiple-Object Tracking with Transformer Introduction TransTrack: Multiple-Object Tracking with Transformer Models Training data Training

Peize Sun 537 Jan 04, 2023
Code and experiments for "Deep Neural Networks for Rank Consistent Ordinal Regression based on Conditional Probabilities"

corn-ordinal-neuralnet This repository contains the orginal model code and experiment logs for the paper "Deep Neural Networks for Rank Consistent Ord

Raschka Research Group 14 Dec 27, 2022
MAUS: A Dataset for Mental Workload Assessment Using Wearable Sensor - Baseline system

MAUS: A Dataset for Mental Workload Assessment Using Wearable Sensor - Baseline system Getting started To start working on this assignment, you should

2 Aug 06, 2022