PatchMatch-RL: Deep MVS with Pixelwise Depth, Normal, and Visibility

Last update: Apr 19, 2022

Related tags

Deep Learning patchmatch-rl

Overview

PatchMatch-RL: Deep MVS with Pixelwise Depth, Normal, and Visibility

Jae Yong Lee, Joseph DeGol, Chuhang Zou, Derek Hoiem

Installation

To install necessary python package for our work:

conda install pytorch torchvision numpy matplotlib pandas tqdm tensorboard cudatoolkit=11.1 -c pytorch -c conda-forge
pip install opencv-python tabulate moviepy openpyxl pyntcloud open3d==0.9 pytorch-lightning==1.4.9

To setup dataset for training for our work, please download:

BlendedMVS Dataset

To setup dataset for testing, please use:

ETH3D High-Res (PatchMatchNet pre-processed sets)
- NOTE: We use our own script to pre-process. We are currently preparing code for the script. We will post update once it is available.
Tanks and Temples (MVSNet pre-processed sets)

Training

To train out method:

python bin/train.py --experiment_name=EXPERIMENT_NAME \
                    --log_path=TENSORBOARD_LOG_PATH \
                    --checkpoint_path=CHECKPOINT_PATH \
                    --dataset_path=ROOT_PATH_TO_DATA \
                    --dataset={BlendedMVS,DTU} \
                    --resume=True # if want to resume training with the same experiment_name

Testing

To test our method, we need two scripts. First script to generate geometetry, and the second script to fuse the geometry. Geometry generation code:

python bin/generate.py --experiment_name=EXPERIMENT_USED_FOR_TRAINING \
                       --checkpoint_path=CHECKPOINT_PATH \
                       --epoch_id=EPOCH_ID \
                       --num_views=NUMBER_OF_VIEWS \
                       --dataset_path=ROOT_PATH_TO_DATA \
                       --output_path=PATH_TO_OUTPUT_GEOMETRY \
                       --width=(optional)WIDTH \
                       --height=(optional)HEIGHT \
                       --dataset={ETH3DHR, TanksAndTemples} \
                       --device=DEVICE

This will generate depths / normals / images into the folder specified by --output_path. To be more precise:

OUTPUT_PATH/
    EXPERIMENT_NAME/
        CHECKPOINT_FILE_NAME/
            SCENE_NAME/
                000000_camera.pth <-- contains intrinsics / extrinsics
                000000_depth_map.pth
                000000_normal_map.pth
                000000_meta.pth <-- contains src_image ids
                ...

Once the geometries are generated, we can use the fusion code to fuse them into point cloud: GPU Fusion code:

python bin/fuse_output.py --output_path=OUTPUT_PATH_USED_IN_GENERATE.py
                          --experiment_name=EXPERIMENT_NAME \
                          --epoch_id=EPOCH_ID \
                          --dataset=DATASET \
                          # fusion related args
                          --proj_th=PROJECTION_DISTANCE_THRESHOLD \
                          --dist_th=DISTANCE_THRESHOLD \
                          --angle_th=ANGLE_THRESHOLD \
                          --num_consistent=NUM_CONSITENT_IMAGES \
                          --target_width=(Optional) target image width for fusion \
                          --target_height=(Optional) target image height for fusion \
                          --device=DEVICE \

The target width / height are useful for fusing depth / normal after upsampling.

We also provide ETH3D testing script:

python bin/evaluate_eth3d.py --eth3d_binary_path=PATH_TO_BINARY_EXE \
                             --eth3d_gt_path=PATH_TO_GT_MLP_FOLDER \
                             --output_path=PATH_TO_FOLDER_WITH_POINTCLOUDS \
                             --experiment_name=NAME_OF_EXPERIMENT \
                             --epoch_id=EPOCH_OF_CHECKPOINT_TO_LOAD (default last.ckpt)

Resources

Citation

If you want to use our work in your project, please cite:

@InProceedings{lee2021patchmatchrl,
    author    = {Lee, Jae Yong and DeGol, Joseph and Zou, Chuhang and Hoiem, Derek},
    title     = {PatchMatch-RL: Deep MVS with Pixelwise Depth, Normal, and Visibility},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
    month     = {October},
    year      = {2021}
}

PatchMatch-RL: Deep MVS with Pixelwise Depth, Normal, and Visibility

Related tags

Overview

PatchMatch-RL: Deep MVS with Pixelwise Depth, Normal, and Visibility

Jae Yong Lee, Joseph DeGol, Chuhang Zou, Derek Hoiem

Installation

Training

Testing

Resources

Citation

Owner

Swin-Transformer is basically a hierarchical Transformer whose representation is computed with shifted windows.

INSPIRED: A Transparent Dialogue Dataset for Interactive Semantic Parsing

TLDR: Twin Learning for Dimensionality Reduction

RATCHET is a Medical Transformer for Chest X-ray Diagnosis and Reporting

Official implementation of paper Gradient Matching for Domain Generalization

Event-forecasting - Event Forecasting Algorithms With Python

Code for CVPR2021 paper "Robust Reflection Removal with Reflection-free Flash-only Cues"

Deep functional residue identification

Ἀνατομή is a PyTorch library to analyze representation of neural networks

🏅 Top 5% in 제2회 연구개발특구 인공지능 경진대회 AI SPARK 챌린지

JittorVis - Visual understanding of deep learning models

4th place solution to datafactory challenge by Intermarché.

A full pipeline AutoML tool for tabular data

A simple Python library for stochastic graphical ecological models

The official code of "SCROLLS: Standardized CompaRison Over Long Language Sequences".

PyTorch implementation of a collections of scalable Video Transformer Benchmarks.

This repository contains the source code of Auto-Lambda and baselines from the paper, Auto-Lambda: Disentangling Dynamic Task Relationships.

Machine Learning in Asset Management (by @firmai)

Near-Duplicate Video Retrieval with Deep Metric Learning

[NeurIPS 2021 Spotlight] Aligning Pretraining for Detection via Object-Level Contrastive Learning