PatchMatch-RL: Deep MVS with Pixelwise Depth, Normal, and Visibility

Last update: Apr 19, 2022

Related tags

Deep Learning patchmatch-rl

Overview

PatchMatch-RL: Deep MVS with Pixelwise Depth, Normal, and Visibility

Jae Yong Lee, Joseph DeGol, Chuhang Zou, Derek Hoiem

Installation

To install necessary python package for our work:

conda install pytorch torchvision numpy matplotlib pandas tqdm tensorboard cudatoolkit=11.1 -c pytorch -c conda-forge
pip install opencv-python tabulate moviepy openpyxl pyntcloud open3d==0.9 pytorch-lightning==1.4.9

To setup dataset for training for our work, please download:

BlendedMVS Dataset

To setup dataset for testing, please use:

ETH3D High-Res (PatchMatchNet pre-processed sets)
- NOTE: We use our own script to pre-process. We are currently preparing code for the script. We will post update once it is available.
Tanks and Temples (MVSNet pre-processed sets)

Training

To train out method:

python bin/train.py --experiment_name=EXPERIMENT_NAME \
                    --log_path=TENSORBOARD_LOG_PATH \
                    --checkpoint_path=CHECKPOINT_PATH \
                    --dataset_path=ROOT_PATH_TO_DATA \
                    --dataset={BlendedMVS,DTU} \
                    --resume=True # if want to resume training with the same experiment_name

Testing

To test our method, we need two scripts. First script to generate geometetry, and the second script to fuse the geometry. Geometry generation code:

python bin/generate.py --experiment_name=EXPERIMENT_USED_FOR_TRAINING \
                       --checkpoint_path=CHECKPOINT_PATH \
                       --epoch_id=EPOCH_ID \
                       --num_views=NUMBER_OF_VIEWS \
                       --dataset_path=ROOT_PATH_TO_DATA \
                       --output_path=PATH_TO_OUTPUT_GEOMETRY \
                       --width=(optional)WIDTH \
                       --height=(optional)HEIGHT \
                       --dataset={ETH3DHR, TanksAndTemples} \
                       --device=DEVICE

This will generate depths / normals / images into the folder specified by --output_path. To be more precise:

OUTPUT_PATH/
    EXPERIMENT_NAME/
        CHECKPOINT_FILE_NAME/
            SCENE_NAME/
                000000_camera.pth <-- contains intrinsics / extrinsics
                000000_depth_map.pth
                000000_normal_map.pth
                000000_meta.pth <-- contains src_image ids
                ...

Once the geometries are generated, we can use the fusion code to fuse them into point cloud: GPU Fusion code:

python bin/fuse_output.py --output_path=OUTPUT_PATH_USED_IN_GENERATE.py
                          --experiment_name=EXPERIMENT_NAME \
                          --epoch_id=EPOCH_ID \
                          --dataset=DATASET \
                          # fusion related args
                          --proj_th=PROJECTION_DISTANCE_THRESHOLD \
                          --dist_th=DISTANCE_THRESHOLD \
                          --angle_th=ANGLE_THRESHOLD \
                          --num_consistent=NUM_CONSITENT_IMAGES \
                          --target_width=(Optional) target image width for fusion \
                          --target_height=(Optional) target image height for fusion \
                          --device=DEVICE \

The target width / height are useful for fusing depth / normal after upsampling.

We also provide ETH3D testing script:

python bin/evaluate_eth3d.py --eth3d_binary_path=PATH_TO_BINARY_EXE \
                             --eth3d_gt_path=PATH_TO_GT_MLP_FOLDER \
                             --output_path=PATH_TO_FOLDER_WITH_POINTCLOUDS \
                             --experiment_name=NAME_OF_EXPERIMENT \
                             --epoch_id=EPOCH_OF_CHECKPOINT_TO_LOAD (default last.ckpt)

Resources

Citation

If you want to use our work in your project, please cite:

@InProceedings{lee2021patchmatchrl,
    author    = {Lee, Jae Yong and DeGol, Joseph and Zou, Chuhang and Hoiem, Derek},
    title     = {PatchMatch-RL: Deep MVS with Pixelwise Depth, Normal, and Visibility},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
    month     = {October},
    year      = {2021}
}

PatchMatch-RL: Deep MVS with Pixelwise Depth, Normal, and Visibility

Related tags

Overview

PatchMatch-RL: Deep MVS with Pixelwise Depth, Normal, and Visibility

Jae Yong Lee, Joseph DeGol, Chuhang Zou, Derek Hoiem

Installation

Training

Testing

Resources

Citation

Owner

Generalized Jensen-Shannon Divergence Loss for Learning with Noisy Labels

Curating a dataset for bioimage transfer learning

Evolution Strategies in PyTorch

N-Omniglot is a large neuromorphic few-shot learning dataset

Official PyTorch code of Holistic 3D Scene Understanding from a Single Image with Implicit Representation (CVPR 2021)

[CVPR2021 Oral] FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation.

This is the official implementation of Elaborative Rehearsal for Zero-shot Action Recognition (ICCV2021)

This code is 3d-CNN model that can predict environmental value

Official PyTorch implementation of MAAD: A Model and Dataset for Attended Awareness

Sequence modeling benchmarks and temporal convolutional networks

Research Artifact of USENIX Security 2022 Paper: Automated Side Channel Analysis of Media Software with Manifold Learning

EfficientNetV2-with-TPU - Cifar-10 case study

《Lerning n Intrinsic Grment Spce for Interctive Authoring of Grment Animtion》

Code image classification of MNIST dataset using different architectures: simple linear NN, autoencoder, and highway network

(AAAI 2021) Progressive One-shot Human Parsing

Attentive Implicit Representation Networks (AIR-Nets)

Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection

A PyTorch implementation of " EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks."

Code release for paper: The Boombox: Visual Reconstruction from Acoustic Vibrations

SGoLAM - Simultaneous Goal Localization and Mapping