Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

Last update: Jan 02, 2023

Related tags

Overview

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

This repository contains an implementation of our CVPR2021 publication:

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging. S. Mahdi H. Miangoleh, Sebastian Dille, Long Mai, Sylvain Paris, Yağız Aksoy. Main pdf, Supplementary pdf, Project Page.

Change log:

Google Colaboratory notebook is now available [June 2021]
Merge net training dataset generation instructions is now available. [June 2021]
bug fix [June 2021]

Setup

We Provided the implementation of our method using MiDas-v2 and SGRnet as the base.

Environments

Our mergenet model is trained using torch 0.4.1 and python 3.6 and is tested with torch<=1.8.

Download our mergenet model weights from here and put it in

.\pix2pix\checkpoints\mergemodel\latest_net_G.pth

To use MiDas-v2 as base: Install dependancies as following:

conda install pytorch torchvision opencv cudatoolkit=10.2 -c pytorch
conda install matplotlib
conda install scipy
conda install scikit-image

Download the model weights from MiDas-v2 and put it in

./midas/model.pt

activate the environment
python run.py --Final --data_dir PATH_TO_INPUT --output_dir PATH_TO_RESULT --depthNet 0

To use SGRnet as base: Install dependancies as following:

conda install pytorch=0.4.1 cuda92 -c pytorch
conda install torchvision
conda install matplotlib
conda install scikit-image
pip install opencv-python

Follow the official SGRnet repository to compile the syncbn module in ./structuredrl/models/syncbn. Download the model weights from SGRnet and put it in

./structuredrl/model.pth.tar

activate the environment
python run.py --Final --data_dir PATH_TO_INPUT --output_dir PATH_TO_RESULT --depthNet 1

Different input arguments can be used to generate R0 and R20 results as discussed in the paper.

python run.py --R0 --data_dir PATH_TO_INPUT --output_dir PATH_TO_RESULT --depthNet #[0or1]
python run.py --R20 --data_dir PATH_TO_INPUT --output_dir PATH_TO_RESULT --depthNet #[0or1]

Evaluation

Fill in the needed variables in the following matlab file and run:

./evaluation/evaluatedataset.m

estimation_path : path to estimated disparity maps
gt_depth_path : path to gt depth/disparity maps
dataset_disp_gttype : (true) if ground truth data is disparity and (false) if gt depth data is depth.
evaluation_matfile_save_dir : directory to save the evalution results as .mat file.
superpixel_scale : scale parameter to run the superpixels on scaled version of the ground truth images to accelarate the evaluation. use 1 for small gt images.

Training

Navigate to dataset preparation instructions to download and prepare the training dataset.

python ./pix2pix/train.py --dataroot DATASETDIR --name mergemodeltrain --model pix2pix4depth --no_flip --no_dropout

python ./pix2pix/test.py --dataroot DATASETDIR --name mergemodeleval --model pix2pix4depth --no_flip --no_dropout

Citation

This implementation is provided for academic use only. Please cite our paper if you use this code or any of the models.

@INPROCEEDINGS{Miangoleh2021Boosting,
author={S. Mahdi H. Miangoleh and Sebastian Dille and Long Mai and Sylvain Paris and Ya\u{g}{\i}z Aksoy},
title={Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging},
journal={Proc. CVPR},
year={2021},
}

Credits

The "Merge model" code skeleton (./pix2pix folder) was adapted from the pytorch-CycleGAN-and-pix2pix repository.

For MiDaS and SGR inferences we used the scripts and models from MiDas-v2 and SGRnet respectively (./midas and ./structuredrl folders).

Thanks to k-washi for providing us with a Google Colaboratory notebook implementation.

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

Related tags

Overview

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

Change log:

Setup

Environments

Evaluation

Training

Citation

Credits

Owner

Computational Photography Lab @ SFU

Code for paper Decoupled Dynamic Spatial-Temporal Graph Neural Network for Traffic Forecasting

FedGS: A Federated Group Synchronization Framework Implemented by LEAF-MX.

End-to-End Referring Video Object Segmentation with Multimodal Transformers

The GitHub repository for the paper: “Time Series is a Special Sequence: Forecasting with Sample Convolution and Interaction“.

A High-Performance Distributed Library for Large-Scale Bundle Adjustment

KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control

Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology (LMRL Workshop, NeurIPS 2021)

Official Implementation of Swapping Autoencoder for Deep Image Manipulation (NeurIPS 2020)

An automated facial recognition based attendance system (desktop application)

Reinforcement learning framework and algorithms implemented in PyTorch.

Oscar and VinVL

A simple configurable bot for sending arXiv article alert by mail

A paper using optimal transport to solve the graph matching problem.

This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".

Moon-patrol - A faithful recreation of the 1983 hit classic Moon Patrol for the Atari 2600 created using the Pygame library for Python

Software associated to AAAI paper "Planning with Biological Neurons and Synapses"

Automates Machine Learning Pipeline with Feature Engineering and Hyper-Parameters Tuning :rocket:

Open source code for Paper "A Co-Interactive Transformer for Joint Slot Filling and Intent Detection"

Edge-oriented Convolution Block for Real-time Super Resolution on Mobile Devices, ACM Multimedia 2021

Accommodating supervised learning algorithms for the historical prices of the world's favorite cryptocurrency and boosting it through LightGBM.