TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers.

Last update: Dec 29, 2022

Overview

TransMVSNet

This repository contains the official implementation of the paper: "TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers."

Point cloud results on DTU, Tanks & Temples and BlendedMVS.

Change Log

Nov 30, 2021: Initialize repo.
Dec 03, 2021: Upload point cloud results on DTU, BlendedMVS and Tanks & Temples.

Comments

abs_depth_error

I find that ABS_DEPTH_ERROR stays close to 6 or even 7 during training. Is this normal? Here are the training results for epoch 5. Is it because of slow convergence?

avg_test_scalars: {'loss': 4.360309665948113, 'depth_loss': 6.535046514014081, 'entropy_loss': 4.360309665948113, 'abs_depth_error': 6.899323051878795, 'thres2mm_error': 0.16829867261163733, 'thres4mm_error': 0.10954744909229193, 'thres8mm_error': 0.07844322964626443, 'thres14mm_error': 0.06323695212957076, 'thres20mm_error': 0.055751020700780536, 'thres2mm_abserror': 0.597563438798779, 'thres4mm_abserror': 2.7356186663791666, 'thres8mm_abserror': 5.608324628466483, 'thres14mm_abserror': 10.510002394554125, 'thres20mm_abserror': 16.67409769420184, 'thres>20mm_abserror': 78.15814284054947}
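For readers unfamiliar with the scalars in the log above, here is a minimal sketch of how metrics of this kind are commonly computed. It assumes that `abs_depth_error` is the mean absolute depth error over valid pixels and that `thresNmm_error` is the fraction of valid pixels whose error exceeds N mm; the repository's actual definitions may differ. The function name, the validity rule (`gt > 0`), and the sample values are hypothetical.

```python
def depth_metrics(pred, gt, thresholds=(2, 4, 8, 14, 20)):
    """Return the mean absolute error and, per threshold, the fraction of
    valid pixels whose absolute error exceeds that threshold."""
    # Assumption: pixels with a non-positive ground-truth depth are invalid.
    errors = [abs(p - g) for p, g in zip(pred, gt) if g > 0]
    if not errors:
        raise ValueError("no valid ground-truth pixels")
    metrics = {"abs_depth_error": sum(errors) / len(errors)}
    for t in thresholds:
        metrics[f"thres{t}mm_error"] = sum(e > t for e in errors) / len(errors)
    return metrics


# Hypothetical depths in mm; the last pixel has no ground truth.
pred = [500.0, 503.0, 510.0, 530.0, 0.0]
gt = [501.0, 500.0, 500.0, 500.0, 0.0]
m = depth_metrics(pred, gt)
print(m["abs_depth_error"])   # 11.0
print(m["thres2mm_error"])    # 0.75
```

In this toy example a single 30 mm outlier accounts for most of the mean, which is one reason a mean absolute error can look large while most pixels fall under the tighter thresholds.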

opened by zhang-snowy 7
About the fusion setting in DTU

Thank you for your great contribution. The script uses gipuma as the fusion method with num_consistent=5, prob_threshold=0.05, disp_threshold=0.25. However, it produces point clouds with only about half as many points as the DTU point clouds you provide, which leads to much poorer results on DTU. Is some setting in the script wrong, or is it because the script does not use the dynamic fusion method described in the paper? Could you provide the dynamic fusion process for DTU?
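To illustrate why these settings affect the number of fused points, the following is a rough sketch of a per-pixel keep rule. It is not gipuma's implementation: it assumes a pixel is kept when its confidence is at least `prob_threshold` and at least `num_consistent` views agree with it, and it leaves `disp_threshold` out entirely. The function name and sample values are hypothetical.

```python
def keep_pixel(confidence, consistent_views, prob_threshold=0.05, num_consistent=5):
    """Keep a pixel only if it is confident enough and enough views agree."""
    return confidence >= prob_threshold and consistent_views >= num_consistent


# (confidence, number of agreeing views) for four hypothetical pixels
pixels = [(0.90, 6), (0.90, 3), (0.02, 7), (0.40, 5)]
kept_strict = [keep_pixel(c, n) for c, n in pixels]
kept_loose = [keep_pixel(c, n, num_consistent=3) for c, n in pixels]
print(sum(kept_strict), sum(kept_loose))  # 2 3
```

Under this simplified rule, lowering `num_consistent` keeps more pixels, so differences in these thresholds are one plausible source of a gap in point count.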

opened by DIVE128 5
Testing on TnT advanced dataset

Hi, thank you for sharing this great work!

I'm trying to test TransMVSNet on the TnT advanced dataset, but I ran into some problems. My test environment is Ubuntu 16.04 with CUDA 11.3 and PyTorch 1.10.

First, there is no cams_1 folder in the TnT dataset. Is it a revised version of the original cams folder, or did you just rename the folder?

I simply renamed the folder and then ran scripts/test_tnt.sh, but inference is rather slow: about 10 seconds per image (1056 x 1920) on a 1080 Ti. Is that normal?

Finally, I obtained the fused point cloud, but it is meaningless. I checked the depth maps and confidence maps, and all of the data look very strange and are apparently wrong.

Can you help me with these problems?

opened by CanCanZeng 4
Some implement details about the paper

First of all, thanks for your paper. I'm looking forward to your open-sourced code.

I have some questions about your paper (hopefully you can reply, thanks in advance!):

(1) In Section 4.2: "The model is trained with Adam for 10 epochs with an initial learning rate of 0.001, which decays by a factor of 0.5 respectively after 6, 8, and 12 epochs." I'm confused about the epochs. I also noticed that this training strategy is different from CasMVSNet. Did you try the training strategy in CasMVSNet? What's the difference?

(2) In Table 4(b), focal loss (what is the value of \gamma?) surpasses CE loss by 0.06. However, from Table 4(e) and Table 6 we infer that the best model uses CE loss (FL with \gamma=0). My question is: did you keep the focal loss \gamma unchanged in the ablation study in Table 4? If not, how does \gamma change? Could you elaborate?

Really appreciate it!
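The two points raised in this comment can be made concrete with a small sketch. It assumes 0-indexed epochs and a plain step-decay schedule, and it uses the standard single-sample focal loss form, -(1 - p)^\gamma log(p); the function names are hypothetical and this is not the repository's training code.

```python
import math


def lr_at_epoch(epoch, base_lr=0.001, milestones=(6, 8, 12), gamma=0.5):
    """Learning rate during a 0-indexed epoch under step decay."""
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed


def focal_loss(p_true, gamma):
    """Focal loss for one sample, given the probability of the true class."""
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


# With only 10 epochs, the milestone at epoch 12 is never reached.
schedule = [lr_at_epoch(e) for e in range(10)]
print(schedule)

# With gamma = 0 the modulating factor is 1, so focal loss equals cross-entropy.
print(focal_loss(0.7, 0.0), -math.log(0.7))
```

Under these assumptions the learning rate is halved only twice during a 10-epoch run, which is the inconsistency the first question points at, and the second print shows why "FL with \gamma=0" and CE loss are the same thing.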

opened by JeffWang987 4
source code

Hi, @Lxiangyue Thank you for the nice paper.

It has been over a month since the authors announced that the code would be made available. May I know when the code will be released (or whether it will not be released)?

opened by Ys-Jung77 3
Testing on my own dataset

Hi, thanks for your interesting work. I tested your code on one of the DTU scenes (Moda). As you can see from the following image, the results are quite good.

However, I got a very bad result when I tested it on one of my own datasets (see the following picture) using your pretrained model (model_dtu). Do you think the object is too complicated and too different from the DTU dataset, so that this is all we can get from the pretrained model without retraining it? Is it possible to improve the result by changing the input parameters? In general, would you please share your opinion about this result?

opened by AliKaramiFBK 1
generate dense 3D point cloud

Thanks for your great work. I just ran a test on the DTU testing dataset and got the depth map for each view, but I am a bit confused about how to generate the 3D point cloud using your code. Would you please let me know? Best

opened by AliKaramiFBK 1
GPU memory consumption

Hi! Thanks for your excellent work! When I tested on the DTU dataset with the pretrained model, the GPU memory consumption was 4439MB, but the paper reports 3778MB.

I do not know where the problem is.

opened by JianfeiJ 0
Using my own data

If I already have the intrinsic and extrinsic matrices of the cameras, which means I don't need to run SfM in COLMAP, how should I structure my data to train the model?
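As a possible starting point for this question, here is a sketch that renders known camera matrices as text in the layout commonly used by MVSNet-style datasets: an `extrinsic` block (4x4), an `intrinsic` block (3x3), and a final line with the minimum depth and depth interval. That this repository expects exactly this layout is an assumption not confirmed anywhere on this page, so it should be checked against the dataset loaders. The function name and all numeric values are hypothetical.

```python
def format_cam_text(extrinsic, intrinsic, depth_min, depth_interval):
    """Render one camera as MVSNet-style text (assumed layout)."""
    lines = ["extrinsic"]
    # 4x4 world-to-camera matrix, one row per line
    lines += [" ".join(str(v) for v in row) for row in extrinsic]
    lines += ["", "intrinsic"]
    # 3x3 intrinsic matrix K, one row per line
    lines += [" ".join(str(v) for v in row) for row in intrinsic]
    lines += ["", f"{depth_min} {depth_interval}"]
    return "\n".join(lines) + "\n"


# Hypothetical camera: identity pose, made-up focal length and principal point.
extrinsic = [[1.0, 0.0, 0.0, 0.0],
             [0.0, 1.0, 0.0, 0.0],
             [0.0, 0.0, 1.0, 0.0],
             [0.0, 0.0, 0.0, 1.0]]
intrinsic = [[1000.0, 0.0, 640.0],
             [0.0, 1000.0, 360.0],
             [0.0, 0.0, 1.0]]
cam_text = format_cam_text(extrinsic, intrinsic, depth_min=425.0, depth_interval=2.5)
print(cam_text)
```

The returned string could then be written to one text file per view.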

opened by PaperDollssss 2
TnT dataset results

Thanks for the great work. I followed the instructions and uploaded the TnT reconstruction results, but got an F-score of 60.29, and my point clouds are larger than the uploaded ones. Were the released point clouds reconstructed with the parameter settings in test_tnt.sh, or do the parameters need to be tuned manually? :smile:

opened by CC9310 1
TankAndTemple Test

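Hi, I tested the Family scene from the TAT dataset using the default script test_tnt.sh with normal fusion, and ended up with a point cloud file of only 13MB. On inspection, I found that the _geo.png images in the generated mask folder are mostly black, so most regions of the resulting final.png are invalid. The geometric consistency thresholds are the defaults, 0.01 and 1. Have you seen the same problem on your side?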
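For context, a geometric consistency check of the kind mentioned here is often written as two tests per pixel and source view. The sketch below assumes that 0.01 is a relative depth difference threshold and 1 is a reprojection distance threshold in pixels; the comment does not say which value is which, and the repository's code may differ. The function name and sample values are hypothetical.

```python
def geo_consistent(reproj_dist_px, ref_depth, reproj_depth,
                   dist_thresh=1.0, rel_depth_thresh=0.01):
    """True if a reference pixel passes both geometric checks for one source view."""
    rel_depth_diff = abs(reproj_depth - ref_depth) / ref_depth
    return reproj_dist_px < dist_thresh and rel_depth_diff < rel_depth_thresh


# Hypothetical pixels: (reprojection distance in px, reference depth, reprojected depth)
samples = [(0.4, 600.0, 601.0), (0.4, 600.0, 612.0), (2.5, 600.0, 600.5)]
geo_mask = [geo_consistent(*s) for s in samples]
print(geo_mask)  # [True, False, False]
```

Under this reading, a mostly black _geo.png would mean that few pixels pass both tests, for example because the estimated depths are poor or the thresholds are too strict for the scene's depth scale.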

opened by lt-xiang 13
Why is there a big gap between the reproducing results and the paper results?

I tried the pre-trained model you provide on the DTU dataset. The results I got are mean_acc=0.299, mean_comp=0.385, overall=0.342, whereas the results presented in the paper are mean_acc=0.321, mean_comp=0.289, overall=0.305.

I do not know where the problem is.
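The numbers quoted above are consistent with the overall score being the average of mean accuracy and mean completeness, which the short check below reproduces. The function name is hypothetical.

```python
def overall_score(mean_acc, mean_comp):
    """Average of accuracy and completeness (lower is better)."""
    return (mean_acc + mean_comp) / 2


reproduced = overall_score(0.299, 0.385)
paper = overall_score(0.321, 0.289)
print(round(reproduced, 3), round(paper, 3))  # 0.342 0.305
```

Read this way, the gap comes from completeness (0.385 versus 0.289), while the reproduced accuracy (0.299) is actually lower than the paper's (0.321).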

opened by cainsmile 14

Releases(T&T_ply)

T&T_ply(Dec 3, 2021)

All point clouds of the Tanks and Temples benchmark (intermediate & advanced) reconstructed by TransMVSNet.
Source code(tar.gz)
Source code(zip)
Auditorium.ply(431.23 MB)
Ballroom.ply(1528.99 MB)
Courtroom.ply(834.75 MB)
Family.ply(803.59 MB)
Francis.ply(1282.90 MB)
Horse.ply(492.03 MB)
Lighthouse.ply(1168.33 MB)
M60.ply(1811.29 MB)
Museum.ply(1627.54 MB)
Palace.ply(1885.63 MB)
Panther.ply(1901.28 MB)
Playground.ply(1338.61 MB)
Temple.ply(884.09 MB)
Train.ply(1637.88 MB)
DTU_ply(Dec 3, 2021)

Point clouds of all 22 scans in the DTU evaluation set reconstructed by TransMVSNet.
Source code(tar.gz)
Source code(zip)
mvsnet001_l3.ply(316.31 MB)
mvsnet004_l3.ply(247.56 MB)
mvsnet009_l3.ply(348.00 MB)
mvsnet010_l3.ply(329.21 MB)
mvsnet011_l3.ply(282.33 MB)
mvsnet012_l3.ply(242.44 MB)
mvsnet013_l3.ply(341.50 MB)
mvsnet015_l3.ply(284.33 MB)
mvsnet023_l3.ply(426.30 MB)
mvsnet024_l3.ply(287.40 MB)
mvsnet029_l3.ply(274.25 MB)
mvsnet032_l3.ply(241.61 MB)
mvsnet033_l3.ply(242.26 MB)
mvsnet034_l3.ply(395.63 MB)
mvsnet048_l3.ply(163.66 MB)
mvsnet049_l3.ply(213.80 MB)
mvsnet062_l3.ply(331.19 MB)
mvsnet075_l3.ply(202.86 MB)
mvsnet077_l3.ply(70.91 MB)
mvsnet110_l3.ply(307.08 MB)
mvsnet114_l3.ply(434.83 MB)
mvsnet118_l3.ply(427.05 MB)
BLD_ply(Dec 3, 2021)

Point clouds of all 7 scenes in the BlendedMVS validation set reconstructed by TransMVSNet.
Source code(tar.gz)
Source code(zip)
59817e4a1bd4b175e7038d19.ply(278.67 MB)
59d2657f82ca7774b1ec081d.ply(229.97 MB)
5a6400933d809f1d8200af15.ply(427.39 MB)
5b7a3890fc8fcf6781e2593a.ply(698.58 MB)
5b950c71608de421b1e7318f.ply(228.24 MB)
5ba19a8a360c7c30c1c169df.ply(310.95 MB)
5c189f2326173c3a09ed7ef3.ply(136.71 MB)

Owner

Megvii Research 3D Group

3D Group of Megvii (Face++) Research (formerly the SLAM Group)

