Aerial Depth Completion

This work is described in the letter "Aerial Single-View Depth Completion with Image-Guided Uncertainty Estimation", by Lucas Teixeira, Martin R. Oswald, Marc Pollefeys and Margarita Chli, published in the IEEE Robotics and Automation Letters (RA-L / ICRA). ETHZ Library link.

Video:

Presentation:

Citations:

If you use this code or the Aerial Dataset, please cite the following publication:

@article{Teixeira:etal:RAL2020,
    title   = {{Aerial Single-View Depth Completion with Image-Guided Uncertainty Estimation}},
    author  = {Lucas Teixeira and Martin R. Oswald and Marc Pollefeys and Margarita Chli},
    journal = {{IEEE} Robotics and Automation Letters ({RA-L})},
    doi     = {10.1109/LRA.2020.2967296},
    year    = {2020}
}

The NYUv2, CAB and PVS datasets additionally require citing their original authors. During our research, we reformatted the CAB and PVS datasets and created ground-truth depth for them. This code also contains third-party networks used for comparison. Please cite their authors properly if you use them.

Acknowledgment:

The authors thank Fangchang Ma and Abdelrahman Eldesokey for sharing their code, which is partially used here. The authors also thank the owners of the 3D models used to build the dataset. They are identified in each 3D model file.

Data and Simulator

Trained Models

Several trained models are available - here.

Datasets

Aerial Dataset - link
NYUv2 Dataset - link (preprocessed by Fangchang Ma and originally from Silberman et al. ECCV12)
CAB Dataset - link (In this work, we created the depth information for the dataset originally published in Teixeira and Chli IROS16)
PVS Dataset - link (In this work, we created the depth information for the dataset originally published in Restrepo et al. P&RS14)

To be used together by our code, the datasets need to be merged: the content of the train folder of each dataset needs to be placed in a single train folder, and likewise for the eval folder.
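
The merge described above can be scripted. The following is a minimal sketch and not part of the repository: the function name merge_datasets and the folder names in the usage note are hypothetical. It assumes each dataset root contains one subfolder per split, and it refuses to overwrite when two datasets contain an entry with the same name.

```python
import shutil
from pathlib import Path


def merge_datasets(dataset_roots, merged_root, splits=("train", "eval")):
    """Copy the content of each dataset's split folders into one merged folder.

    Assumption: every root in dataset_roots may contain one subfolder per
    split name. The split names follow the text above; adjust them if your
    copy of the data uses different names (for example "val").
    """
    merged_root = Path(merged_root)
    for split in splits:
        target = merged_root / split
        target.mkdir(parents=True, exist_ok=True)
        for root in map(Path, dataset_roots):
            source = root / split
            if not source.is_dir():
                continue  # this dataset does not provide this split
            for entry in sorted(source.iterdir()):
                destination = target / entry.name
                if destination.exists():
                    # Stop instead of silently overwriting a clashing name.
                    raise FileExistsError(f"{destination} already exists")
                if entry.is_dir():
                    shutil.copytree(entry, destination)
                else:
                    shutil.copy2(entry, destination)
    return merged_root
```

A call could look like merge_datasets(["aerial", "nyuv2", "cab", "pvs"], "dataset_merged"), where all five folder names are placeholders for wherever the downloads were unpacked.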

Simulator

The Aerial Dataset was created using this simulator link.

3D Models

Most of the 3D models used to create the dataset can be downloaded here. The license files list the authors of the 3D models. Some models were extended with a satellite image from Google Earth.

Running the code

Prerequisites

PyTorch 1.0.1
Python 3.6
Plus dependencies

Testing Example

python3 main.py --evaluate "/media/lucas/lucas-ds2-1tb/tmp/model_best.pth.tar" --data-path "/media/lucas/lucas-ds2-1tb/dataset_big_v12"

Training Example

python3 main.py --data-path "/media/lucas/lucas-ds2-1tb/dataset_big_v12" --workers 8 -lr 0.00001 --batch-size 1 --dcnet-arch gudepthcompnet18 --training-mode dc1_only --criterion l2

python3 main.py --data-path "/media/lucas/lucas-ds2-1tb/dataset_big_v12" --workers 8 --criterion l2 --training-mode dc0-cf1-ln1 --dcnet-arch ged_depthcompnet --dcnet-pretrained /media/lucas/lucas-ds2-1tb/tmp/model_best.pth.tar:dc_weights --confnet-arch cbr3-c1 --confnet-pretrained /media/lucas/lucas-ds2-1tb/tmp/model_best.pth.tar:conf_weights --lossnet-arch ged_depthcompnet --lossnet-pretrained /media/lucas/lucas-ds2-1tb/tmp/model_best.pth.tar:lossdc_weights
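
In the second command, each pretrained network is passed as path:network_name. How main.py splits this string is not shown here, so the sketch below is only one plausible way to do it. The helper name split_checkpoint_spec and the example path are hypothetical. It splits at the last colon, so a path that itself contains a colon is still handled.

```python
VALID_NETWORK_NAMES = ("dc_weights", "conf_weights", "lossdc_weights")


def split_checkpoint_spec(spec):
    """Split 'path:network_name' into (path, network_name).

    Hypothetical helper, not the repository's own parser. The split is
    done at the last colon so the path itself may contain colons.
    """
    path, sep, network_name = spec.rpartition(":")
    if not sep or not path:
        raise ValueError(f"expected 'path:network_name', got {spec!r}")
    if network_name not in VALID_NETWORK_NAMES:
        raise ValueError(f"unknown network name {network_name!r}")
    return path, network_name


# Placeholder path, for illustration only.
path, name = split_checkpoint_spec("/tmp/model_best.pth.tar:dc_weights")
print(path, name)
```

Validating the network name early gives a clear error before any checkpoint file is opened.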

Parameters

Parameter	Description
--help	show this help message and exit
--output NAME	output base name in the subfolder results
--training-mode ARCH	the training mode. Our framework has up to three parts: the dc (depth completion net), the cf (confidence estimation net) and the ln (loss net). The number 0 or 1 indicates whether the network is updated during back-propagation. All the networks can be pre-loaded using other parameters. training_mode: dc1_only ; dc1-ln0 ; dc1-ln1 ; dc0-cf1-ln0 ; dc1-cf1-ln0 ; dc0-cf1-ln1 ; dc1-cf1-ln1 (default: dc1_only)
--dcnet-arch ARCH	model architecture: resnet18 ; udepthcompnet18 ; gms_depthcompnet ; ged_depthcompnet ; gudepthcompnet18 (default: resnet18)
--dcnet-pretrained PATH	path to the pretrained checkpoint for the dc net (default: empty). Each checkpoint can contain multiple networks, so the one to load must be named explicitly. The format is path:network_name, where network_name can be: dc_weights, conf_weights, lossdc_weights.
--dcnet-modality MODALITY	modality: rgb ; rgbd ; rgbdw (default: rgbd)
--confnet-arch ARCH	model architecture: cbr3-c1 ; cbr3-cbr1-c1 ; cbr3-cbr1-c1res ; join ; none (default: cbr3-c1)
--confnet-pretrained PATH	path to the pretrained checkpoint for the cf net (default: empty). Each checkpoint can contain multiple networks, so the one to load must be named explicitly. The format is path:network_name, where network_name can be: dc_weights, conf_weights, lossdc_weights.
--lossnet-arch ARCH	model architecture: resnet18 ; udepthcompnet18 (uresnet18) ; gms_depthcompnet (nconv-ms) ; ged_depthcompnet (nconv-ed) ; gudepthcompnet18 (nconv-uresnet18) (default: ged_depthcompnet)
--lossnet-pretrained PATH	path to the pretrained checkpoint for the ln net (default: empty). Each checkpoint can contain multiple networks, so the one to load must be named explicitly. The format is path:network_name, where network_name can be: dc_weights, conf_weights, lossdc_weights.
--data-type DATA	dataset: visim ; kitti (default: visim)
--data-path PATH	path to data folder - this folder has to have inside a val folder and a train folder if it is not in evaluation mode.
--data-modality MODALITY	this field defines the input modality in the format colour-depth-weight. kfd and fd mean random sampling in the ground truth. kgt means keypoints from SLAM with depth from the ground truth. kor means keypoints from SLAM with depth from the landmark. The weight can be binary (bin) or derived from the SLAM uncertainty (kw). The parameter can be one of the following: rgb-fd-bin ; rgb-kfd-bin ; rgb-kgt-bin ; rgb-kor-bin ; rgb-kor-kw (default: rgb-fd-bin)
--workers N	number of data loading workers (default: 10)
--epochs N	number of total epochs to run (default: 15)
--max-gt-depth D	cut-off depth of ground truth, negative values mean infinity (default: inf [m])
--min-depth D	cut-off depth of sparsifier (default: 0 [m])
--max-depth D	cut-off depth of sparsifier, negative values mean infinity (default: inf [m])
--divider D	Normalization factor - zero means per frame (default: 0 [m])
--num-samples N	number of sparse depth samples (default: 500)
--sparsifier SPARSIFIER	sparsifier: uar ; sim_stereo (default: uar)
--criterion LOSS	loss function: l1 ; l2 ; il1 (inverted L1) ; absrel (default: l1)
--optimizer OPTIMIZER	Optimizer: sgd ; adam (default: adam)
--batch-size BATCH_SIZE	mini-batch size (default: 8)
--learning-rate LR	initial learning rate (default: 0.001)
--learning-rate-step LRS	number of epochs between successive reductions of the learning rate (default: 5)
--learning-rate-multiplicator LRM	factor by which the learning rate is multiplied at each reduction (default: 0.1)
--momentum M	momentum (default: 0)
--weight-decay W	weight decay (default: 0)
--val-images N	number of images in the validation image (default: 10)
--print-freq N	print frequency (default: 10)
--resume PATH	path to latest checkpoint (default: empty)
--evaluate PATH	evaluates the model on the validation set. All the training parameters are ignored, but the input parameters still matter (default: empty)
--precision-recall	enables the calculation of the precision-recall table. It might be necessary to adjust the bin and top values in the ConfidencePixelwiseThrAverageMeter class. The result table shows the error and the density for each confidence threshold (default: false)
--confidence-threshold VALUE	confidence threshold. The best way to select this number is to create the precision-recall table (default: 0)
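
The --training-mode values in the table follow a regular naming pattern: a part name (dc, cf or ln) followed by 0 or 1. As an illustration only, the sketch below decodes such a string into the parts it names and whether each is updated. The helper parse_training_mode is hypothetical, and the repository may interpret the string differently.

```python
def parse_training_mode(mode):
    """Map a training-mode string to {part: trainable}.

    Hypothetical helper illustrating the naming scheme. Parts are 'dc',
    'cf' and 'ln'; the trailing digit says whether the part is updated
    during back-propagation (1) or kept frozen (0).
    """
    if mode == "dc1_only":
        return {"dc": True}
    parts = {}
    for token in mode.split("-"):
        name, flag = token[:-1], token[-1:]
        if name not in ("dc", "cf", "ln") or flag not in ("0", "1"):
            raise ValueError(f"unrecognised token {token!r} in {mode!r}")
        parts[name] = flag == "1"
    return parts


print(parse_training_mode("dc0-cf1-ln1"))
# {'dc': False, 'cf': True, 'ln': True}
```

Read this way, dc0-cf1-ln1 names all three parts, keeps the depth completion net frozen, and updates the confidence and loss nets.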

Contact

In case of any issue, feel free to contact me via email: lteixeira at mavt.ethz.ch.

Owner

ETHZ V4RL
