Code for Transformers Solve Limited Receptive Field for Monocular Depth Prediction

Last update: Dec 16, 2022

Related tags

Overview

Official PyTorch code for Transformers Solve Limited Receptive Field for Monocular Depth Prediction.
Guanglei Yang, Hao Tang, Mingli Ding, Nicu Sebe, Elisa Ricci.
Apply Transformer into depth predciton and surface normal estimation.

Prepare pretrain model

we choose R50-ViT-B_16 as our encoder.

wget https://storage.googleapis.com/vit_models/imagenet21k/R50+ViT-B_16.npz 
mkdir ./model/vit_checkpoint/imagenet21k 
mv R50+ViT-B_16.npz ./model/vit_checkpoint/imagenet21k/R50+ViT-B_16.npz

Prepare Dateset

prepare nyu

mkdir -p pytorch/dataset/nyu_depth_v2
python utils/download_from_gdrive.py 1AysroWpfISmm-yRFGBgFTrLy6FjQwvwP pytorch/dataset/nyu_depth_v2/sync.zip
cd pytorch/dataset/nyu_depth_v2
unzip sync.zip

prepare kitti

cd dataset
mkdir kitti_dataset
cd kitti_dataset
### image move kitti_archives_to_download.txt into kitti_dataset
wget -i kitti_archives_to_download.txt

### label
wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_depth_annotated.zip
unzip data_depth_annotated.zip
cd train
mv * ../
cd ..  
rm -r train
cd val
mv * ../
cd ..
rm -r val
rm data_depth_annotated.zip

Environment

pip install -r requirement.txt

Run

Train

CUDA_VISIBLE_DEVICES=0,1,2,3 python bts_main.py arguments_train_nyu.txt
CUDA_VISIBLE_DEVICES=0,1,2,3 python bts_main.py arguments_train_eigen.txt

Test: Pick up nice result

CUDA_VISIBLE_DEVICES=1 python bts_test.py arguments_test_nyu.txt
python ../utils/eval_with_pngs.py --pred_path vis_att_bts_nyu_v2_pytorch_att/raw/ --gt_path ../../dataset/nyu_depth_v2/official_splits/test/ --dataset nyu --min_depth_eval 1e-3 --max_depth_eval 10 --eigen_crop
CUDA_VISIBLE_DEVICES=1 python bts_test.py arguments_test_eigen.txt
python ../utils/eval_with_pngs.py --pred_path vis_att_bts_eigen_v2_pytorch_att/raw/ --gt_path ./dataset/kitti_dataset/ --dataset kitti --min_depth_eval 1e-3 --max_depth_eval 80 --do_kb_crop --garg_crop

Debug

CUDA_VISIBLE_DEVICES=1 python bts_main.py arguments_train_nyu_debug.txt

Download Pretrained Model

sh scripts/download_TransDepth_model.sh kitti_depth

sh scripts/download_TransDepth_model.sh nyu_depth

sh scripts/download_TransDepth_model.sh nyu_surfacenormal

Reference

BTS

ViT

Do‘s code

Visualization result share

We provide all vis result of all tasks. link

Code for Transformers Solve Limited Receptive Field for Monocular Depth Prediction

Related tags

Overview

Prepare pretrain model

Prepare Dateset

prepare nyu

prepare kitti

Environment

Run

Download Pretrained Model

Reference

Visualization result share

Owner

stanley

Predict stock movement with Machine Learning and Deep Learning algorithms

An official PyTorch Implementation of Boundary-aware Self-supervised Learning for Video Scene Segmentation (BaSSL)

This repository contains numerical implementation for the paper Intertemporal Pricing under Reference Effects: Integrating Reference Effects and Consumer Heterogeneity.

Official PyTorch implementation of "Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets" (ICLR 2021)

Digitalizing-Prescription-Image - PIRDS - Prescription Image Recognition and Digitalizing System is a OCR make with Tensorflow

Kinetics-Data-Preprocessing

Weighing Counts: Sequential Crowd Counting by Reinforcement Learning

Python library for computer vision labeling tasks. The core functionality is to translate bounding box annotations between different formats-for example, from coco to yolo.

Expand human face editing via Global Direction of StyleCLIP, especially to maintain similarity during editing.

Implementation of Nalbach et al. 2017 paper.

Task Transformer Network for Joint MRI Reconstruction and Super-Resolution (MICCAI 2021)

Official implementation of "A Unified Objective for Novel Class Discovery", ICCV2021 (Oral)

Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis Implementation

Alpha-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression

NLMpy - A Python package to create neutral landscape models

PyTorch trainer and model for Sequence Classification

Deploy a ML inference service on a budget in less than 10 lines of code.

OpenLT: An open-source project for long-tail classification

Many Class Activation Map methods implemented in Pytorch for CNNs and Vision Transformers. Including Grad-CAM, Grad-CAM++, Score-CAM, Ablation-CAM and XGrad-CAM

GEP (GDB Enhanced Prompt) - a GDB plug-in for GDB command prompt with fzf history search, fish-like autosuggestions, auto-completion with floating window, partial string matching in history, and more!