PN-Net: a neural field-based framework for depth estimation from single-view RGB images.

Last update: Oct 02, 2021

Overview

PN-Net

We present a neural field-based framework for depth estimation from single-view RGB images. Rather than representing a 2D depth map as a single-channel image, we define it as the iso-surface of a scalar field in an implicit space, which we introduce as the Pseudo 3D Space. We convert a 3D Depth Field into a 2D depth image using an efficient and differentiable sphere tracing rendering algorithm. We introduce two further innovations. First, we present a Field Warping technique that recasts depth field estimation as a classification problem, which is far more efficient to learn than regressing a signed distance function (SDF). Second, we derive the 3D Pseudo Normal from the 2D depth map; it is closely related to the actual 3D surface normal and can be computed from the depth field's implicit representation with an uncalibrated camera. Experiments validate the method's performance. Our Pseudo 3D Space simplifies current implicit field learning and offers a consistent framework for advancing shape reconstruction from multiple cues.
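The sphere tracing step mentioned in the overview can be illustrated with a minimal NumPy sketch. This is not the repository's renderer: the analytic sphere field, the function names `sphere_sdf` and `sphere_trace`, and the parameters `num_steps`, `eps`, and `t_max` are all assumptions chosen for illustration. The idea it shows is only the general one: each ray advances by the current field value until it reaches the iso-surface, and the travelled distance becomes the rendered depth.

```python
import numpy as np


def sphere_sdf(points, center, radius):
    """Signed distance to a sphere; a stand-in for a learned scalar field."""
    return np.linalg.norm(points - center, axis=-1) - radius


def sphere_trace(sdf, origins, directions, num_steps=64, eps=1e-4, t_max=10.0):
    """March each ray forward by the field value until it hits the iso-surface.

    origins, directions: arrays of shape (N, 3); directions are unit vectors.
    Returns the distance travelled along each ray, clipped to t_max for misses.
    """
    t = np.zeros(origins.shape[0])
    for _ in range(num_steps):
        points = origins + t[:, None] * directions
        dist = sdf(points)
        # A ray keeps marching while it is off the surface and within range.
        active = (np.abs(dist) > eps) & (t < t_max)
        t = np.where(active, t + dist, t)
    return np.minimum(t, t_max)


# Two rays from the origin: one aimed at the sphere, one that misses it.
origins = np.zeros((2, 3))
directions = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
center = np.array([0.0, 0.0, 3.0])
depth = sphere_trace(lambda p: sphere_sdf(p, center, 1.0), origins, directions)
print(depth)
```

Every operation in the loop is a plain array expression, which is what makes this kind of renderer straightforward to express in an autodiff framework; the version here uses NumPy only to stay self-contained.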

Set up dataset path

Suppose your dataset is placed like this:

/absolute_path/bts_nyu_data/
    sync/
        ...
    official_splits/
        train/
            ...
        test/
            ...

Add the following line to ~/.bashrc:

export PNNET_NYU2_DATASET=/absolute_path/bts_nyu_data/
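Before training, it may be worth confirming that the variable is visible to Python and that the directory matches the layout shown above. The helper below is a hypothetical check, not part of the repository; the name `check_nyu_layout` is an assumption, and only the `PNNET_NYU2_DATASET` variable and the `sync/`, `official_splits/train/`, and `official_splits/test/` directories come from the instructions here.

```python
import os
from pathlib import Path


def check_nyu_layout(root):
    """Return the expected sub-directories that are missing under root."""
    expected = ["sync", "official_splits/train", "official_splits/test"]
    return [rel for rel in expected if not (Path(root) / rel).is_dir()]


if __name__ == "__main__":
    root = os.environ.get("PNNET_NYU2_DATASET")
    if root is None:
        print("PNNET_NYU2_DATASET is not set; open a new shell or source ~/.bashrc")
    else:
        missing = check_nyu_layout(root)
        print("missing directories:", missing if missing else "none")
```

Run it from a fresh shell so that the updated ~/.bashrc has been read.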

Train with

python train_bts_nyu_nd3.py -c configs/train_bts_nyu_nd3_tb_vis.json

This configuration includes the pseudo normal and total bending losses.

