PatchMatch-RL: Deep MVS with Pixelwise Depth, Normal, and Visibility

Overview

Jae Yong Lee, Joseph DeGol, Chuhang Zou, Derek Hoiem

Installation

To install the necessary Python packages:

conda install pytorch torchvision numpy matplotlib pandas tqdm tensorboard cudatoolkit=11.1 -c pytorch -c conda-forge
pip install opencv-python tabulate moviepy openpyxl pyntcloud open3d==0.9 pytorch-lightning==1.4.9
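
After installation, it can help to confirm which versions actually ended up in the environment. The sketch below uses only the standard library. The distribution names come from the commands above, with the assumption that the conda `pytorch` package registers itself as `torch`. `report_versions` is a hypothetical helper, not part of this repository.

```python
from importlib import metadata

# Distribution names taken from the install commands above.
# Assumption: the conda package "pytorch" registers itself as "torch".
PACKAGES = ["torch", "torchvision", "numpy", "open3d", "pytorch-lightning"]


def report_versions(packages):
    """Return {name: installed version, or None if the package is missing}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in report_versions(PACKAGES).items():
        print(f"{name:20s} {version or 'NOT INSTALLED'}")
```

A `NOT INSTALLED` entry for `open3d` or `pytorch-lightning` would suggest the pinned `pip install` step did not complete.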

To set up the datasets for training, please download:

To set up the datasets for testing, please use:

  • ETH3D High-Res (PatchMatchNet pre-processed sets)
    • NOTE: We pre-process this dataset with our own script. We are preparing the script for release and will post an update once it is available.
  • Tanks and Temples (MVSNet pre-processed sets)

Training

To train our method:

python bin/train.py --experiment_name=EXPERIMENT_NAME \
                    --log_path=TENSORBOARD_LOG_PATH \
                    --checkpoint_path=CHECKPOINT_PATH \
                    --dataset_path=ROOT_PATH_TO_DATA \
                    --dataset={BlendedMVS,DTU} \
                    --resume=True  # optional: resume training with the same experiment_name
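
When launching several training runs from a script, it can be convenient to assemble the command in Python. The following is a minimal sketch that only builds the argument list from the flags shown above. `build_train_command` is a hypothetical helper, and the accepted dataset names are assumed to be exactly the two listed in the command.

```python
import shlex
import sys

# Dataset names accepted by --dataset, as listed in the command above.
TRAIN_DATASETS = ("BlendedMVS", "DTU")


def build_train_command(experiment_name, log_path, checkpoint_path,
                        dataset_path, dataset, resume=False):
    """Assemble the argv list for bin/train.py from the flags shown above."""
    if dataset not in TRAIN_DATASETS:
        raise ValueError(f"dataset must be one of {TRAIN_DATASETS}, got {dataset!r}")
    command = [
        sys.executable, "bin/train.py",
        f"--experiment_name={experiment_name}",
        f"--log_path={log_path}",
        f"--checkpoint_path={checkpoint_path}",
        f"--dataset_path={dataset_path}",
        f"--dataset={dataset}",
    ]
    if resume:
        # Only add the flag when resuming an existing experiment.
        command.append("--resume=True")
    return command


if __name__ == "__main__":
    cmd = build_train_command("my_experiment", "logs", "checkpoints",
                              "data", "DTU", resume=True)
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to start the run.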

Testing

Testing our method takes two scripts: the first generates the geometry, and the second fuses it. Geometry generation code:

python bin/generate.py --experiment_name=EXPERIMENT_USED_FOR_TRAINING \
                       --checkpoint_path=CHECKPOINT_PATH \
                       --epoch_id=EPOCH_ID \
                       --num_views=NUMBER_OF_VIEWS \
                       --dataset_path=ROOT_PATH_TO_DATA \
                       --output_path=PATH_TO_OUTPUT_GEOMETRY \
                       --width=(optional)WIDTH \
                       --height=(optional)HEIGHT \
                       --dataset={ETH3DHR,TanksAndTemples} \
                       --device=DEVICE

This writes depths / normals / images into the folder specified by --output_path, laid out as follows:

OUTPUT_PATH/
    EXPERIMENT_NAME/
        CHECKPOINT_FILE_NAME/
            SCENE_NAME/
                000000_camera.pth <-- contains intrinsics / extrinsics
                000000_depth_map.pth
                000000_normal_map.pth
                000000_meta.pth <-- contains src_image ids
                ...
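
To inspect what was generated for a scene, the files can be grouped by view. The sketch below relies only on the naming pattern shown in the tree above (`<view id>_<kind>.pth`). `index_scene_outputs` is a hypothetical helper, not part of this repository.

```python
from collections import defaultdict
from pathlib import Path


def index_scene_outputs(scene_dir):
    """Group files such as 000000_depth_map.pth by their view id.

    Returns {view_id: {kind: path}}, for example
    {"000000": {"camera": Path(...), "depth_map": Path(...)}}.
    """
    views = defaultdict(dict)
    for path in sorted(Path(scene_dir).glob("*.pth")):
        # "000000_depth_map" -> view id "000000", kind "depth_map"
        view_id, _, kind = path.stem.partition("_")
        views[view_id][kind] = path
    return dict(views)
```

Given the `.pth` extension, each file can presumably be read with `torch.load(path)`; that is an assumption based on the extension, not something stated above.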

Once the geometry is generated, the fusion code fuses it into a point cloud. GPU fusion code:

python bin/fuse_output.py --output_path=OUTPUT_PATH_USED_IN_GENERATE \
                          --experiment_name=EXPERIMENT_NAME \
                          --epoch_id=EPOCH_ID \
                          --dataset=DATASET \
                          --proj_th=PROJECTION_DISTANCE_THRESHOLD \
                          --dist_th=DISTANCE_THRESHOLD \
                          --angle_th=ANGLE_THRESHOLD \
                          --num_consistent=NUM_CONSISTENT_IMAGES \
                          --target_width=(optional)TARGET_WIDTH \
                          --target_height=(optional)TARGET_HEIGHT \
                          --device=DEVICE

The --proj_th, --dist_th, --angle_th, and --num_consistent arguments control the fusion itself. The target width / height are optional and are useful for fusing depth / normal after upsampling.
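
When depth and normal maps are resized to a target resolution, the pinhole intrinsics used to back-project them generally have to be rescaled to match. The sketch below shows that standard adjustment. Whether bin/fuse_output.py handles this internally is not stated here, so treat `scale_intrinsics` as a hypothetical illustration. It also assumes pixel coordinates scale directly with image size, with no half-pixel offset.

```python
def scale_intrinsics(K, src_size, dst_size):
    """Rescale a 3x3 pinhole intrinsic matrix for a resized image.

    K is a nested list [[fx, s, cx], [0, fy, cy], [0, 0, 1]].
    src_size and dst_size are (width, height) tuples.
    """
    sx = dst_size[0] / src_size[0]  # horizontal scale factor
    sy = dst_size[1] / src_size[1]  # vertical scale factor
    return [
        [K[0][0] * sx, K[0][1] * sx, K[0][2] * sx],  # fx, skew, cx
        [K[1][0] * sy, K[1][1] * sy, K[1][2] * sy],  # 0, fy, cy
        [K[2][0], K[2][1], K[2][2]],                 # homogeneous row
    ]


if __name__ == "__main__":
    K = [[1000.0, 0.0, 320.0], [0.0, 1000.0, 240.0], [0.0, 0.0, 1.0]]
    print(scale_intrinsics(K, (640, 480), (1280, 960)))
```

Doubling both dimensions doubles the focal lengths and the principal point, while the last row stays unchanged.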

We also provide an ETH3D testing script:

python bin/evaluate_eth3d.py --eth3d_binary_path=PATH_TO_BINARY_EXE \
                             --eth3d_gt_path=PATH_TO_GT_MLP_FOLDER \
                             --output_path=PATH_TO_FOLDER_WITH_POINTCLOUDS \
                             --experiment_name=NAME_OF_EXPERIMENT \
                             --epoch_id=EPOCH_OF_CHECKPOINT_TO_LOAD  # defaults to last.ckpt

Resources

Citation

If you want to use our work in your project, please cite:

@InProceedings{lee2021patchmatchrl,
    author    = {Lee, Jae Yong and DeGol, Joseph and Zou, Chuhang and Hoiem, Derek},
    title     = {PatchMatch-RL: Deep MVS with Pixelwise Depth, Normal, and Visibility},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
    month     = {October},
    year      = {2021}
}