Kernel Point Convolutions

Related tags

Deep LearningKPConv
Overview

Intro figure

Created by Hugues THOMAS

Introduction

Update 27/04/2020: New PyTorch implementation available. With SemanticKitti, and Windows supported.

This repository contains the implementation of Kernel Point Convolution (KPConv), a point convolution operator presented in our ICCV2019 paper (arXiv). If you find our work useful in your research, please consider citing:

@article{thomas2019KPConv,
    Author = {Thomas, Hugues and Qi, Charles R. and Deschaud, Jean-Emmanuel and Marcotegui, Beatriz and Goulette, Fran{\c{c}}ois and Guibas, Leonidas J.},
    Title = {KPConv: Flexible and Deformable Convolution for Point Clouds},
    Journal = {Proceedings of the IEEE International Conference on Computer Vision},
    Year = {2019}
}

Update 03/05/2019, bug found with TF 1.13 and CUDA 10. We found an internal bug inside tf.matmul operation. It returns absurd values like 1e12, leading to the apparition of NaNs in our network. We advise to use the code with CUDA 9.0 and TF 1.12. More info in issue #15

SemanticKitti Code: You can download the code used for SemanticKitti submission here. It is not clean, has very few explanations, and and could be buggy. Use it only if you are familiar with KPConv implementation.

Installation

A step-by-step installation guide for Ubuntu 16.04 is provided in INSTALL.md. Windows is currently not supported as the code uses tensorflow custom operations.

Experiments

We provide scripts for many experiments. The instructions to run these experiments are in the doc folder.

  • Object Classification: Instructions to train KP-CNN on an object classification task (Modelnet40).

  • Object Segmentation: Instructions to train KP-FCNN on an object segmentation task (ShapeNetPart)

  • Scene Segmentation: Instructions to train KP-FCNN on several scene segmentation tasks (S3DIS, Scannet, Semantic3D, NPM3D).

  • New Dataset: Instructions to train KPConv networks on your own data.

  • Pretrained models: We provide pretrained weights and instructions to load them.

  • Visualization scripts: Instructions to use the three scripts allowing to visualize: the learned features, the kernel deformations and the Effective Receptive Fields.

Performances

The following tables report the current performances on different tasks and datasets. Some scores have been improved since the article submission.

Classification and segmentation of 3D shapes

Method ModelNet40 OA ShapeNetPart classes mIoU ShapeNetPart instances mIoU
KPConv rigid 92.9% 85.0% 86.2%
KPConv deform 92.7% 85.1% 86.4%

Segmentation of 3D scenes

Method Scannet mIoU Sem3D mIoU S3DIS mIoU NPM3D mIoU
KPConv rigid 68.6% 74.6% 65.4% 72.3%
KPConv deform 68.4% 73.1% 67.1% 82.0%

Acknowledgment

Our code uses the nanoflann library.

License

Our code is released under MIT License (see LICENSE file for details).

Updates

  • 17/02/2020: Added a link to SemanticKitti code
  • 24/01/2020: Bug fixes
  • 01/10/2019: Adding visualization scripts.
  • 23/09/2019: Adding pretrained models for NPM3D and S3DIS datasets.
  • 03/05/2019: Bug found with TF 1.13 and CUDA 10.
  • 19/04/2019: Initial release.
Owner
Hugues THOMAS
AI/robotics Researcher. Postdoc at University of Toronto. Focus: Deep Learning and 3D Point clouds. Indoor navigation
Hugues THOMAS
End-to-end machine learning project for rices detection

Basmatinet Welcome to this project folks ! Whether you like it or not this project is all about riiiiice or riz in french. It is also about Deep Learn

Béranger 47 Jun 18, 2022
Library for converting from RGB / GrayScale image to base64 and back.

Library for converting RGB / Grayscale numpy images from to base64 and back. Installation pip install -U image_to_base_64 Conversion RGB to base 64 b

Vladimir Iglovikov 16 Aug 28, 2022
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

About This repository provides data and code for the paper: Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development (subm

Appen Repos 86 Dec 07, 2022
CycleTransGAN-EVC: A CycleGAN-based Emotional Voice Conversion Model with Transformer

CycleTransGAN-EVC CycleTransGAN-EVC: A CycleGAN-based Emotional Voice Conversion Model with Transformer Demo emotion CycleTransGAN CycleTransGAN Cycle

24 Dec 15, 2022
The code of paper 'Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection'

Learning to Aggregate and Personalize 3D Face from In-the-Wild Photo Collection Pytorch implemetation of paper 'Learning to Aggregate and Personalize

Tencent YouTu Research 136 Dec 29, 2022
A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

Segnet is deep fully convolutional neural network architecture for semantic pixel-wise segmentation. This is implementation of http://arxiv.org/pdf/15

Pradyumna Reddy Chinthala 190 Dec 15, 2022
NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

880 Jan 07, 2023
A Quick and Dirty Progressive Neural Network written in TensorFlow.

prog_nn .▄▄ · ▄· ▄▌ ▐ ▄ ▄▄▄· ▐ ▄ ▐█ ▀. ▐█▪██▌•█▌▐█▐█ ▄█▪ •█▌▐█ ▄▀▀▀█▄▐█▌▐█▪▐█▐▐▌ ██▀

SynPon 53 Dec 12, 2022
This is official implementaion of paper "Token Shift Transformer for Video Classification".

This is official implementaion of paper "Token Shift Transformer for Video Classification". We achieve SOTA performance 80.40% on Kinetics-400 val. Paper link

VideoNet 60 Dec 30, 2022
Detect roadway lanes using Python OpenCV for project during the 5th semester at DHBW Stuttgart for lecture in digital image processing.

Find Line Detection (Image Processing) Identifying lanes of the road is very common task that human driver performs. It's important to keep the vehicl

LMF 4 Jun 21, 2022
A Gura parser implementation for Python

Gura Python parser This repository contains the implementation of a Gura (compliant with version 1.0.0) format parser in Python. Installation pip inst

Gura Config Lang 19 Jan 25, 2022
Knowledge Distillation Toolbox for Semantic Segmentation

SegDistill: Toolbox for Knowledge Distillation on Semantic Segmentation Networks This repo contains the supported code and configuration files for Seg

9 Dec 12, 2022
A multi-mode modulator for multi-domain few-shot classification (ICCV)

A multi-mode modulator for multi-domain few-shot classification (ICCV)

Yanbin Liu 8 Apr 28, 2022
3ds-Ghidra-Scripts - Ghidra scripts to help with 3ds reverse engineering

3ds Ghidra Scripts These are ghidra scripts to help with 3ds reverse engineering

Zak 7 May 23, 2022
Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for the task of Visual Document Understanding (VDU)

DocFormer - PyTorch Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for t

171 Jan 06, 2023
An open software package to develop BCI based brain and cognitive computing technology for recognizing user's intention using deep learning

An open software package to develop BCI based brain and cognitive computing technology for recognizing user's intention using deep learning

deepbci 272 Jan 08, 2023
Anomaly Detection Based on Hierarchical Clustering of Mobile Robot Data

We proposed a new approach to detect anomalies of mobile robot data. We investigate each data seperately with two clustering method hierarchical and k-means. There are two sub-method that we used for

Zekeriyya Demirci 1 Jan 09, 2022
This is the source code for generating the ASL-Skeleton3D and ASL-Phono datasets. Check out the README.md for more details.

ASL-Skeleton3D and ASL-Phono Datasets Generator The ASL-Skeleton3D contains a representation based on mapping into the three-dimensional space the coo

Cleison Amorim 5 Nov 20, 2022
Unofficial PyTorch implementation of Attention Free Transformer (AFT) layers by Apple Inc.

aft-pytorch Unofficial PyTorch implementation of Attention Free Transformer's layers by Zhai, et al. [abs, pdf] from Apple Inc. Installation You can i

Rishabh Anand 184 Dec 12, 2022
Progressive Domain Adaptation for Object Detection

Progressive Domain Adaptation for Object Detection Implementation of our paper Progressive Domain Adaptation for Object Detection, based on pytorch-fa

96 Nov 25, 2022