Official implementation of YOGO for Point-Cloud Processing

Related tags

Deep LearningYOGO
Overview

You Only Group Once: Efficient Point-Cloud Processing with Token Representation and Relation Inference Module

By Chenfeng Xu, Bohan Zhai, Bichen Wu, Tian Li, Wei Zhan, Peter Vajda, Kurt Keutzer, and Masayoshi Tomizuka.

This repository contains a Pytorch implementation of YOGO, a new, simple, and elegant model for point-cloud processing. The framework of our YOGO is shown below:

Selected quantitative results of different approaches on the ShapeNet and S3DIS dataset.

ShapeNet part segmentation:

Method mIoU Latency (ms) GPU Memory (GB)
PointNet 83.7 21.4 1.5
RSNet 84.9 73.8 0.8
PointNet++ 85.1 77.7 2.0
DGCNN 85.1 86.7 2.4
PointCNN 86.1 134.2 2.5
YOGO(KNN) 85.2 25.6 0.9
YOGO(Ball query) 85.1 21.3 1.0

S3DIS scene parsing:

Method mIoU Latency (ms) GPU Memory (GB)
PointNet 42.9 24.8 1.0
RSNet 51.9 111.5 1.1
PointNet++* 50.7 501.5 1.6
DGCNN 47.9 174.3 2.4
PointCNN 57.2 282.4 4.6
YOGO(KNN) 54.0 27.7 2.0
YOGO(Ball query) 53.8 24.0 2.0

For more detail, please refer to our paper: YOGO. The work is a follow-up work to SqueezeSegV3 and Visual Transformers. If you find this work useful for your research, please consider citing:

@misc{xu2021group,
      title={You Only Group Once: Efficient Point-Cloud Processing with Token Representation and Relation Inference Module}, 
      author={Chenfeng Xu and Bohan Zhai and Bichen Wu and Tian Li and Wei Zhan and Peter Vajda and Kurt Keutzer and Masayoshi Tomizuka},
      year={2021},
      eprint={2103.09975},
      archivePrefix={arXiv},
      primaryClass={cs.RO}
}

Related works:

@inproceedings{xu2020squeezesegv3,
  title={Squeezesegv3: Spatially-adaptive convolution for efficient point-cloud segmentation},
  author={Xu, Chenfeng and Wu, Bichen and Wang, Zining and Zhan, Wei and Vajda, Peter and Keutzer, Kurt and Tomizuka, Masayoshi},
  booktitle={European Conference on Computer Vision},
  pages={1--19},
  year={2020},
  organization={Springer}
}
@misc{wu2020visual,
      title={Visual Transformers: Token-based Image Representation and Processing for Computer Vision}, 
      author={Bichen Wu and Chenfeng Xu and Xiaoliang Dai and Alvin Wan and Peizhao Zhang and Zhicheng Yan and Masayoshi Tomizuka and Joseph Gonzalez and Kurt Keutzer and Peter Vajda},
      year={2020},
      eprint={2006.03677},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

License

YOGO is released under the BSD license (See LICENSE for details).

Installation

The instructions are tested on Ubuntu 16.04 with python 3.6 and Pytorch 1.5 with GPU support.

  • Clone the YOGO repository:
git clone https://github.com/chenfengxu714/YOGO.git
  • Use pip to install required Python packages:
pip install -r requirements.txt
  • Install KNN library:
cd convpoint/knn/
python setup.py install --home='.'

Pre-trained Models

The pre-trained YOGO is avalible at Google Drive, you can directly download them.

Inference

To infer the predictions for the entire dataset:

python train.py [config-file] --devices [gpu-ids] --evaluate --configs.evaluate.best_checkpoint_path [path to the model checkpoint]

for example, you can run the below command for ShapeNet inference:

python train.py configs/shapenet/yogo/yogo.py --devices 0 --evaluate --configs.evaluate.best_checkpoint_path ./runs/shapenet/best.pth

Training:

To train the model:

python train.py [config-file] --devices [gpu-ids] --evaluate --configs.evaluate.best_checkpoint_path [path to the model checkpoint]

for example, you can run the below command for ShapeNet training:

python train.py configs/shapenet/yogo/yogo.py --devices 0

You can run the below command for multi-gpu training:

python train.py configs/shapenet/yogo/yogo.py --devices 0,1,2,3

Note that we conduct training on Titan RTX gpu, you can modify the batch size according your GPU memory, the performance is slightly different.

Acknowledgement:

The code is modified from PVCNN and the code for KNN is from Pointconv.

Owner
Chenfeng Xu
A Ph.D. student in UC Berkeley.
Chenfeng Xu
Official repository for CVPR21 paper "Deep Stable Learning for Out-Of-Distribution Generalization".

StableNet StableNet is a deep stable learning method for out-of-distribution generalization. This is the official repo for CVPR21 paper "Deep Stable L

120 Dec 28, 2022
Model-based reinforcement learning in TensorFlow

Bellman Website | Twitter | Documentation (latest) What does Bellman do? Bellman is a package for model-based reinforcement learning (MBRL) in Python,

46 Nov 09, 2022
Py-faster-rcnn - Faster R-CNN (Python implementation)

py-faster-rcnn has been deprecated. Please see Detectron, which includes an implementation of Mask R-CNN. Disclaimer The official Faster R-CNN code (w

Ross Girshick 7.8k Jan 03, 2023
Wafer Fault Detection using MlOps Integration

Wafer Fault Detection using MlOps Integration This is an end to end machine learning project with MlOps integration for predicting the quality of wafe

Sethu Sai Medamallela 0 Mar 11, 2022
Code for One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022)

One-shot Talking Face Generation from Single-speaker Audio-Visual Correlation Learning (AAAI 2022) Paper | Demo Requirements Python = 3.6 , Pytorch

FuxiVirtualHuman 84 Jan 03, 2023
MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

Facebook Research 338 Dec 29, 2022
codebase for "A Theory of the Inductive Bias and Generalization of Kernel Regression and Wide Neural Networks"

Eigenlearning This repo contains code for replicating the experiments of the paper A Theory of the Inductive Bias and Generalization of Kernel Regress

Jamie Simon 45 Dec 02, 2022
Collection of sports betting AI tools.

sports-betting sports-betting is a collection of tools that makes it easy to create machine learning models for sports betting and evaluate their perf

George Douzas 109 Dec 31, 2022
Differential Privacy for Heterogeneous Federated Learning : Utility & Privacy tradeoffs

Differential Privacy for Heterogeneous Federated Learning : Utility & Privacy tradeoffs In this work, we propose an algorithm DP-SCAFFOLD(-warm), whic

19 Nov 10, 2022
CIFS: Improving Adversarial Robustness of CNNs via Channel-wise Importance-based Feature Selection

CIFS This repository provides codes for CIFS (ICML 2021). CIFS: Improving Adversarial Robustness of CNNs via Channel-wise Importance-based Feature Sel

Hanshu YAN 19 Nov 12, 2022
GAN-generated image detection based on CNNs

GAN-image-detection This repository contains a GAN-generated image detector developed to distinguish real images from synthetic ones. The detector is

Image and Sound Processing Lab 17 Dec 15, 2022
The source code of the ICCV2021 paper "PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering"

Website | ArXiv | Get Start | Video PIRenderer The source code of the ICCV2021 paper "PIRenderer: Controllable Portrait Image Generation via Semantic

Ren Yurui 261 Jan 09, 2023
This is a classifier which basically predicts whether there is a gun law in a state or not, depending on various things like murder rates etc.

Gun-Laws-Classifier This is a classifier which basically predicts whether there is a gun law in a state or not, depending on various things like murde

Awais Saleem 1 Jan 20, 2022
Location-Sensitive Visual Recognition with Cross-IOU Loss

The trained models are temporarily unavailable, but you can train the code using reasonable computational resource. Location-Sensitive Visual Recognit

Kaiwen Duan 146 Dec 25, 2022
Changing the Mind of Transformers for Topically-Controllable Language Generation

We will first introduce the how to run the IPython notebook demo by downloading our pretrained models. Then, we will introduce how to run our training and evaluation code.

IESL 20 Dec 06, 2022
This is the repository for our paper Ditch the Gold Standard: Re-evaluating Conversational Question Answering

Ditch the Gold Standard: Re-evaluating Conversational Question Answering This is the repository for our paper Ditch the Gold Standard: Re-evaluating C

Princeton Natural Language Processing 38 Dec 16, 2022
Tacotron 2 - PyTorch implementation with faster-than-realtime inference

Tacotron 2 (without wavenet) PyTorch implementation of Natural TTS Synthesis By Conditioning Wavenet On Mel Spectrogram Predictions. This implementati

NVIDIA Corporation 4.1k Jan 03, 2023
Official implementation for paper: Feature-Style Encoder for Style-Based GAN Inversion

Feature-Style Encoder for Style-Based GAN Inversion Official implementation for paper: Feature-Style Encoder for Style-Based GAN Inversion. Code will

InterDigital 63 Jan 03, 2023
《Towards High Fidelity Face Relighting with Realistic Shadows》(CVPR 2021)

Towards High Fidelity Face-Relighting with Realistic Shadows Andrew Hou, Ze Zhang, Michel Sarkis, Ning Bi, Yiying Tong, Xiaoming Liu. In CVPR, 2021. T

114 Dec 10, 2022