A Confidence-based Iterative Solver of Depths and Surface Normals for Deep Multi-view Stereo

Last update: Nov 17, 2022

Related tags

Deep Learning idn-solver

Overview

idn-solver

Paper | Project Page

This repository contains the code release of our ICCV 2021 paper:

A Confidence-based Iterative Solver of Depths and Surface Normals for Deep Multi-view Stereo

Wang Zhao*, Shaohui Liu*, Yi Wei, Hengkai Guo, Yong-Jin Liu

Installation

We recommend to use conda to setup a specified environment. Run

conda env create -f environment.yml

Test on a sequence

First download the pretrained model from here and put it under ./pretrain/ folder.

Prepare the sequence data with color images, camera poses (4x4 cam2world transformation) and intrinsics. The sequence data structure should be like:

sequence_name
  | color
      | 00000.jpg
  | pose
      | 00000.txt
  | K.txt

Run the following command to get the outputs:

python infer_folder.py --seq_dir /path/to/the/sequence/data --output_dir /path/to/save/outputs --config ./configs/test_folder.yaml

Tune the "reference gap" parameter to make sure there are sufficient overlaps and camera translations within an image pair. For ScanNet-like sequence, we recommend to use reference_gap of 20.

Test on ScanNet

Prepare ScanNet test split data

Download the ScanNet test split data from the official site and pre-process the data using:

python ./data/preprocess.py --data_dir /path/to/scannet/test/split/ --output_dir /path/to/save/pre-processed/scannet/test/data

This includes 1. resize the color images to 480x640 resolution 2. sample the data with interval of 20

Run evaluation

python eval_scannet.py --data_dir /path/to/processed/scannet/test/split/ --config ./configs/test_scannet.yaml

Train

Prepare ScanNet training data

We use the pre-processed ScanNet data from NAS, you could download the data using this link. The data structure is like:

scannet
  | scannet_nas
    | train
      | scene0000_00
          | color
            | 0000.jpg
          | pose
            | 0000.txt
          | depth
            | 0000.npy
          | intrinsic
          | normal
            | 0000_normal.npy
    | val
  | scans_test_sample (preprocessed ScanNet test split)

Run training

Modify the "dataset_path" variable with yours in the config yaml.

The network is trained with a two-stage strategy. The whole training process takes ~6 days with 4 Nvidia V100 GPUs.

python train.py ./configs/scannet_stage1.yaml
python train.py ./configs/scannet_stage2.yaml

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{Zhao_2021_ICCV,
    author    = {Zhao, Wang and Liu, Shaohui and Wei, Yi and Guo, Hengkai and Liu, Yong-Jin},
    title     = {A Confidence-Based Iterative Solver of Depths and Surface Normals for Deep Multi-View Stereo},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {6168-6177}
}

Acknowledgement

This project heavily relies codes from NAS and we thank the authors for releasing their code.

We also thank Xiaoxiao Long for kindly helping with ScanNet evaluations.

A Confidence-based Iterative Solver of Depths and Surface Normals for Deep Multi-view Stereo

Related tags

Overview

idn-solver

Installation

Test on a sequence

Test on ScanNet

Prepare ScanNet test split data

Run evaluation

Train

Prepare ScanNet training data

Run training

Citation

Acknowledgement

Owner

zhaowang

(CVPR2021) Kaleido-BERT: Vision-Language Pre-training on Fashion Domain

PyTorch code for the ICCV'21 paper: "Always Be Dreaming: A New Approach for Class-Incremental Learning"

Realtime Face Anti Spoofing with Face Detector based on Deep Learning using Tensorflow/Keras and OpenCV

TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers.

Rotary Transformer

An official implementation of the Anchor DETR.

A High-Performance Distributed Library for Large-Scale Bundle Adjustment

Nightmare-Writeup - Writeup for the Nightmare CTF Challenge from 2022 DiceCTF

A unofficial pytorch implementation of PAN(PSENet2): Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity

Pixel-Perfect Structure-from-Motion with Featuremetric Refinement (ICCV 2021, Oral)

masscan + nmap + Finger

Multi-Scale Progressive Fusion Network for Single Image Deraining

This repository contains the files for running the Patchify GUI.

Official PyTorch implementation of the paper Image-Based CLIP-Guided Essence Transfer.

An API-first distributed deployment system of deep learning models using timeseries data to analyze and predict systems behaviour

A list of Machine Learning Art Colabs

Get started with Machine Learning with Python - An introduction with Python programming examples

Compute execution plan: A DAG representation of work that you want to get done. Individual nodes of the DAG could be simple python or shell tasks or complex deeply nested parallel branches or embedded DAGs themselves.

Running Google MoveNet Multipose Tracking models on OpenVINO.