Implementation of "Bidirectional Projection Network for Cross Dimension Scene Understanding" CVPR 2021 (Oral)

Last update: Dec 26, 2022

Overview

Existing segmentation methods are mostly unidirectional, i.e., they use 3D information for 2D segmentation or vice versa. However, 2D and 3D information can complement each other in both directions during segmentation. Enabling this two-way exchange is the goal of the bidirectional projection network (BPNet).

Environment

Main

# Torch
$ pip install torch==1.4.0+cu100 torchvision==0.5.0+cu100 -f https://download.pytorch.org/whl/torch_stable.html
# MinkowskiEngine 0.4.1
$ conda install numpy openblas
$ git clone https://github.com/StanfordVL/MinkowskiEngine.git
$ cd MinkowskiEngine
$ git checkout f1a419cc5792562a06df9e1da686b7ce8f3bb5ad
$ python setup.py install
# Others
$ pip install imageio==2.8.0 opencv-python==4.2.0.32 pillow==7.0.0 pyyaml==5.3 scipy==1.4.1 sharedarray==3.2.0 tensorboardx==2.0 tqdm==4.42.1
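
After installation, it can help to confirm that the pinned versions are the ones actually installed. The following is a minimal sketch and not part of the repository: the helper names `installed_version` and `check_pins` are hypothetical, it assumes Python 3.8+ (for `importlib.metadata`), and the pin list simply repeats the versions from the commands above.

```python
from importlib import metadata

# Versions copied from the pip commands above (assumption: local build tags
# such as "+cu100" are irrelevant for the comparison and are stripped).
PINS = {
    "torch": "1.4.0",
    "torchvision": "0.5.0",
    "imageio": "2.8.0",
    "opencv-python": "4.2.0.32",
    "pillow": "7.0.0",
    "pyyaml": "5.3",
    "scipy": "1.4.1",
    "sharedarray": "3.2.0",
    "tensorboardx": "2.0",
    "tqdm": "4.42.1",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every mismatching package."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        # Drop a local version tag such as "+cu100" before comparing.
        base = found.split("+")[0] if found else None
        if base != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINS).items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty output means every listed package matches its pin. MinkowskiEngine is left out because it is built from a specific commit rather than a released version.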

Others

Please refer to env.yml for details.

Prepare data

Download the ScanNet dataset from the official website.
2D: The script comes from the 3DMV repo and requires Python 2; all other code in this repo uses Python 3.
$ python prepare_2d_data.py --scannet_path data/scannetv2 --output_path data/scannetv2_images --export_label_images
3D: dataset/preprocess_3d_scannet.py

Config

BPNet_5cm: config/scannet/bpnet_5cm.yaml

Training

Download the ImageNet-pretrained 2D ResNets from the PyTorch website and put them into the initmodel folder.

model_urls = {
    'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
    'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth',
    'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
    'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth',
    'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth',
}
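
The download step could be scripted as in the sketch below. This is an illustration, not a script from the repository: the helper names `target_path` and `download_all` are hypothetical, and saving each file under its URL's base name is an assumption, since the file names the training config expects inside initmodel are not stated here.

```python
import os
import urllib.request

# Subset of the URLs listed above; extend with the other entries as needed.
MODEL_URLS = {
    "resnet34": "https://download.pytorch.org/models/resnet34-333f7ec4.pth",
    "resnet50": "https://download.pytorch.org/models/resnet50-19c8e357.pth",
}


def target_path(url, folder="initmodel"):
    """Build the local path for a checkpoint URL, keeping its file name."""
    return os.path.join(folder, os.path.basename(url))


def download_all(urls, folder="initmodel"):
    """Download every checkpoint that is not already present in `folder`."""
    os.makedirs(folder, exist_ok=True)
    for name, url in urls.items():
        path = target_path(url, folder)
        if os.path.exists(path):
            print(f"{name}: already present at {path}")
            continue
        print(f"{name}: downloading to {path}")
        urllib.request.urlretrieve(url, path)
```

Calling `download_all(MODEL_URLS)` from the repository root fetches the listed checkpoints; rename the files afterwards if the config refers to them differently.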

Start training: sh tool/train.sh EXP_NAME /PATH/TO/CONFIG NUMBER_OF_THREADS
Resume: sh tool/resume.sh EXP_NAME /PATH/TO/CONFIG(copied one) NUMBER_OF_THREADS

NUMBER_OF_THREADS is the number of threads to use per process (GPU); ideally it should be Total_threads / gpu_number_used.
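
As a rough illustration of that rule, the sketch below computes a value for NUMBER_OF_THREADS. The function name `threads_per_process` is hypothetical, rounding down with a floor of one thread is an assumption, and `os.cpu_count()` reports logical CPUs, which may differ from the threads actually available to the job.

```python
import os


def threads_per_process(total_threads, gpus_used):
    """Split the available threads evenly across GPU processes (at least 1)."""
    if gpus_used < 1:
        raise ValueError("gpus_used must be at least 1")
    return max(1, total_threads // gpus_used)


if __name__ == "__main__":
    total = os.cpu_count() or 1  # logical CPUs visible to Python
    # 4 GPUs is only an example value; use the number of GPUs in your run.
    print(threads_per_process(total, gpus_used=4))
```

For instance, a machine with 16 threads and 4 GPUs gives 4 threads per process.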

Testing

Test with your trained model or our pre-trained model (voxel_size: 5cm): sh tool/test.sh EXP_NAME /PATH/TO/CONFIG(copied one) NUMBER_OF_THREADS

Copyright and License

You are granted the LICENSE for both academic and commercial use.

Acknowledgment

Our code is based on MinkowskiEngine. We also referred to SparseConvNet and semseg.

Citation

@inproceedings{hu-2021-bidirectional,
    author      = {Wenbo Hu and Hengshuang Zhao and Li Jiang and Jiaya Jia and Tien-Tsin Wong},
    title       = {Bidirectional Projection Network for Cross Dimension Scene Understanding},
    booktitle   = {CVPR},
    year        = {2021}
}

Owner

Hu Wenbo
