Official implementation of YOGO for Point-Cloud Processing

Related tags

Deep LearningYOGO
Overview

You Only Group Once: Efficient Point-Cloud Processing with Token Representation and Relation Inference Module

By Chenfeng Xu, Bohan Zhai, Bichen Wu, Tian Li, Wei Zhan, Peter Vajda, Kurt Keutzer, and Masayoshi Tomizuka.

This repository contains a Pytorch implementation of YOGO, a new, simple, and elegant model for point-cloud processing. The framework of our YOGO is shown below:

Selected quantitative results of different approaches on the ShapeNet and S3DIS dataset.

ShapeNet part segmentation:

Method mIoU Latency (ms) GPU Memory (GB)
PointNet 83.7 21.4 1.5
RSNet 84.9 73.8 0.8
PointNet++ 85.1 77.7 2.0
DGCNN 85.1 86.7 2.4
PointCNN 86.1 134.2 2.5
YOGO(KNN) 85.2 25.6 0.9
YOGO(Ball query) 85.1 21.3 1.0

S3DIS scene parsing:

Method mIoU Latency (ms) GPU Memory (GB)
PointNet 42.9 24.8 1.0
RSNet 51.9 111.5 1.1
PointNet++* 50.7 501.5 1.6
DGCNN 47.9 174.3 2.4
PointCNN 57.2 282.4 4.6
YOGO(KNN) 54.0 27.7 2.0
YOGO(Ball query) 53.8 24.0 2.0

For more detail, please refer to our paper: YOGO. The work is a follow-up work to SqueezeSegV3 and Visual Transformers. If you find this work useful for your research, please consider citing:

@misc{xu2021group,
      title={You Only Group Once: Efficient Point-Cloud Processing with Token Representation and Relation Inference Module}, 
      author={Chenfeng Xu and Bohan Zhai and Bichen Wu and Tian Li and Wei Zhan and Peter Vajda and Kurt Keutzer and Masayoshi Tomizuka},
      year={2021},
      eprint={2103.09975},
      archivePrefix={arXiv},
      primaryClass={cs.RO}
}

Related works:

@inproceedings{xu2020squeezesegv3,
  title={Squeezesegv3: Spatially-adaptive convolution for efficient point-cloud segmentation},
  author={Xu, Chenfeng and Wu, Bichen and Wang, Zining and Zhan, Wei and Vajda, Peter and Keutzer, Kurt and Tomizuka, Masayoshi},
  booktitle={European Conference on Computer Vision},
  pages={1--19},
  year={2020},
  organization={Springer}
}
@misc{wu2020visual,
      title={Visual Transformers: Token-based Image Representation and Processing for Computer Vision}, 
      author={Bichen Wu and Chenfeng Xu and Xiaoliang Dai and Alvin Wan and Peizhao Zhang and Zhicheng Yan and Masayoshi Tomizuka and Joseph Gonzalez and Kurt Keutzer and Peter Vajda},
      year={2020},
      eprint={2006.03677},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

License

YOGO is released under the BSD license (See LICENSE for details).

Installation

The instructions are tested on Ubuntu 16.04 with python 3.6 and Pytorch 1.5 with GPU support.

  • Clone the YOGO repository:
git clone https://github.com/chenfengxu714/YOGO.git
  • Use pip to install required Python packages:
pip install -r requirements.txt
  • Install KNN library:
cd convpoint/knn/
python setup.py install --home='.'

Pre-trained Models

The pre-trained YOGO is avalible at Google Drive, you can directly download them.

Inference

To infer the predictions for the entire dataset:

python train.py [config-file] --devices [gpu-ids] --evaluate --configs.evaluate.best_checkpoint_path [path to the model checkpoint]

for example, you can run the below command for ShapeNet inference:

python train.py configs/shapenet/yogo/yogo.py --devices 0 --evaluate --configs.evaluate.best_checkpoint_path ./runs/shapenet/best.pth

Training:

To train the model:

python train.py [config-file] --devices [gpu-ids] --evaluate --configs.evaluate.best_checkpoint_path [path to the model checkpoint]

for example, you can run the below command for ShapeNet training:

python train.py configs/shapenet/yogo/yogo.py --devices 0

You can run the below command for multi-gpu training:

python train.py configs/shapenet/yogo/yogo.py --devices 0,1,2,3

Note that we conduct training on Titan RTX gpu, you can modify the batch size according your GPU memory, the performance is slightly different.

Acknowledgement:

The code is modified from PVCNN and the code for KNN is from Pointconv.

Owner
Chenfeng Xu
A Ph.D. student in UC Berkeley.
Chenfeng Xu
An LSTM for time-series classification

Update 10-April-2017 And now it works with Python3 and Tensorflow 1.1.0 Update 02-Jan-2017 I updated this repo. Now it works with Tensorflow 0.12. In

Rob Romijnders 391 Dec 27, 2022
Contrastive Learning for Compact Single Image Dehazing, CVPR2021

AECR-Net Contrastive Learning for Compact Single Image Dehazing, CVPR2021. Official Pytorch based implementation. Paper arxiv Pytorch Version TODO: mo

glassy 253 Jan 01, 2023
Code and Data for the paper: Molecular Contrastive Learning with Chemical Element Knowledge Graph [AAAI 2022]

Knowledge-enhanced Contrastive Learning (KCL) Molecular Contrastive Learning with Chemical Element Knowledge Graph [ AAAI 2022 ]. We construct a Chemi

Fangyin 58 Dec 26, 2022
An open-source project for applying deep learning to medical scenarios

Auto Vaidya An open source solution for creating end-end web app for employing the power of deep learning in various clinical scenarios like implant d

Smaranjit Ghose 18 May 29, 2022
VarCLR: Variable Semantic Representation Pre-training via Contrastive Learning

    VarCLR: Variable Representation Pre-training via Contrastive Learning New: Paper accepted by ICSE 2022. Preprint at arXiv! This repository contain

squaresLab 32 Oct 24, 2022
N-gram models- Unsmoothed, Laplace, Deleted Interpolation

N-gram models- Unsmoothed, Laplace, Deleted Interpolation

Ravika Nagpal 1 Jan 04, 2022
A simple Neural Network that predicts the label for a series of handwritten digits

Neural_Network A simple Neural Network that predicts the label for a series of handwritten numbers This program tries to predict the label (1,2,3 etc.

Ty 1 Dec 18, 2021
Official implement of Paper:A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sening images

A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images 深度监督影像融合网络DSIFN用于高分辨率双时相遥感影像变化检测 Of

Chenxiao Zhang 135 Dec 19, 2022
Stream images from a connected camera over MQTT, view using Streamlit, record to file and sqlite

mqtt-camera-streamer Summary: Publish frames from a connected camera or MJPEG/RTSP stream to an MQTT topic, and view the feed in a browser on another

Robin Cole 183 Dec 16, 2022
NAS-FCOS: Fast Neural Architecture Search for Object Detection (CVPR 2020)

NAS-FCOS: Fast Neural Architecture Search for Object Detection This project hosts the train and inference code with pretrained model for implementing

Ning Wang 180 Dec 06, 2022
Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

RAVE: Realtime Audio Variational autoEncoder Official implementation of RAVE: A variational autoencoder for fast and high-quality neural audio synthes

ACIDS 587 Jan 01, 2023
Underwater industrial application yolov5m6

This project wins the intelligent algorithm contest finalist award and stands out from over 2000teams in China Underwater Robot Professional Contest, entering the final of China Underwater Robot Prof

8 Nov 09, 2022
Flaxformer: transformer architectures in JAX/Flax

Flaxformer is a transformer library for primarily NLP and multimodal research at Google.

Google 116 Jan 05, 2023
A simple API wrapper for Discord interactions.

Your ultimate Discord interactions library for discord.py. About | Installation | Examples | Discord | PyPI About What is discord-py-interactions? dis

james 641 Jan 03, 2023
AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.

AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.

Adelaide Intelligent Machines (AIM) Group 3k Jan 02, 2023
Real-time pose estimation accelerated with NVIDIA TensorRT

trt_pose Want to detect hand poses? Check out the new trt_pose_hand project for real-time hand pose and gesture recognition! trt_pose is aimed at enab

NVIDIA AI IOT 803 Jan 06, 2023
An open-source outlier detection package by Getcontact Data Team

pyfbad The pyfbad library supports anomaly detection projects. An end-to-end anomaly detection application can be written using the source codes of th

Teknasyon Tech 41 Dec 27, 2022
Interactive Image Segmentation via Backpropagating Refinement Scheme

Won-Dong Jang and Chang-Su Kim, Interactive Image Segmentation via Backpropagating Refinement Scheme, CVPR 2019

Won-Dong Jang 85 Sep 15, 2022
DI-smartcross - Decision Intelligence Platform for Traffic Crossing Signal Control

DI-smartcross DI-smartcross - Decision Intelligence Platform for Traffic Crossin

OpenDILab 213 Jan 02, 2023
ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information

ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information This repository contains code, model, dataset for ChineseBERT at ACL2021. Ch

413 Dec 01, 2022