Public repository of the 3DV 2021 paper "Generative Zero-Shot Learning for Semantic Segmentation of 3D Point Clouds"

Related tags

Deep Learning3DGenZ
Overview

Generative Zero-Shot Learning for Semantic Segmentation of 3D Point Clouds

Björn Michele1), Alexandre Boulch1), Gilles Puy1), Maxime Bucher1) and Renaud Marlet1)2)

1) Valeo.ai 2)LIGM, Ecole des Ponts, Univ Gustave Eiffel, CNRS, Marne-la-Vallée, Franc

Accepted at 3DV 2021
Arxiv: Paper and Supp.
Poster or Presentation

Abstract: While there has been a number of studies on Zero-Shot Learning (ZSL) for 2D images, its application to 3D data is still recent and scarce, with just a few methods limited to classification. We present the first generative approach for both ZSL and Generalized ZSL (GZSL) on 3D data, that can handle both classification and, for the first time, semantic segmentation. We show that it reaches or outperforms the state of the art on ModelNet40 classification for both inductive ZSL and inductive GZSL. For semantic segmentation, we created three benchmarks for evaluating this new ZSL task, using S3DIS, ScanNet and SemanticKITTI. Our experiments show that our method outperforms strong baselines, which we additionally propose for this task.

If you want to cite this work:

@inproceedings{michele2021generative,
  title={Generative Zero-Shot Learning for Semantic Segmentation of {3D} Point Cloud},
  author={Michele, Bj{\"o}rn and Boulch, Alexandre and Puy, Gilles and Bucher, Maxime and Marlet, Renaud},
  booktitle={International Conference on 3D Vision (3DV)},
  year={2021}

Code

We provide in this repository the code and the pretrained models for the semantic segmentation tasks on SemanticKITTI and ScanNet.

To-Do:

  • We will add more experiments in the future (You could "watch" the repo to stay updated).

Code Semantic Segmentation

Installation

Dependencies: Please see requirements.txt for all needed code libraries. Tested with: Pytorch 1.6.0 and 1.7.1 (both Cuda 10.1). As torch-geometric is needed Pytoch >= 1.4.0 is required.

  1. Clone this repository.

  2. Download and/or install the backbones (ConvPoint is also necessary for our adaption of FKAConv. More information: ConvPoint, FKAConv, KP-Conv).

    • For ConvPoint:
    cd 3DGenZ/genz3d/convpoint/convpoint/knn
    python3 setup.py install --home="."
    
    • For FKAConv:
    cd 3DGenZ/genz3d/fkaconv
    pip install -ve . 
    
  3. Download the datasets.

    • For an out of the box start we recommend the following folder structure.
    ~/3DGenZ
    ~/data/scannet/
    ~/data/semantic_kitti/
    
  4. Download the semantic word embeddings and the pretrained backbones.

    • Place the semantic word embeddings in
    3DGenZ/genz3d/word_representations/
    
    • For SN, the pre-trained backbone model and the config file, are placed in
    3DGenZ/genz3d/fkaconv/examples/scannet/FKAConv_scannet_ZSL4
    

    The complete ZSL-trained model cpkt is placed in (create the folder if necessary)

    3DGenZ/genz3d/seg/run/scannet/
    
    • For SK, the pre-trained backbone-model, the "Log-..." folder is placed in
    3DGenZ/genz3d/kpconv/results
    

    And the complete ZSL-trained model ckpt is placed in

    3DGenZ/genz3d/seg/run/sk
    

Run training and evalutation

  1. Training (Classifier layer): In 3DGenZ/genz3d/seg/ you find for each of the datasets a folder with scripts to run the generator and classificator training.(see: SN,SK)
    • Alternatively, you can use the pretrained models from us.
  2. Evalutation: Is done with the evaluation functions of the backbones. (see: SN_eval, KP-Conv_eval)

Backbones

For the datasets we used different backbones, for which we highly rely on their code basis. In order to adapt them to the ZSL setting we made the change that during the backbone training no crops of point clouds with unseen classes are shown (if there is a single unseen class

  • ConvPoint [1] for the S3DIS dataset (and also partly used for the ScanNet dataset).
  • FKAConv [2] for the ScanNet dataset.
  • KPConv [3] for the SemanticKITTI dataset.

Datasets

For semantic segmentation we did experiments on 3 datasets.

  • SemanticKITTI [4][5].
  • S3DIS [6].
  • ScanNet[7].

Acknowledgements

For the Generator Training we use parts of the code basis of ZS3.
For the backbones we use the code of ConvPoint, FKAConv and KPConv.

References

[1] Boulch, A. (2020). ConvPoint: Continuous convolutions for point cloud processing. Computers & Graphics, 88, 24-34.
[2] Boulch, A., Puy, G., & Marlet, R. (2020). FKAConv: Feature-kernel alignment for point cloud convolution. In Proceedings of the Asian Conference on Computer Vision.
[3] Thomas, H., Qi, C. R., Deschaud, J. E., Marcotegui, B., Goulette, F., & Guibas, L. J. (2019). Kpconv: Flexible and deformable convolution for point clouds. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 6411-6420).
[4] Behley, J., Garbade, M., Milioto, A., Quenzel, J., Behnke, S., Stachniss, C., & Gall, J. (2019). Semantickitti: A dataset for semantic scene understanding of lidar sequences. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 9297-9307).
[5] Geiger, A., Lenz, P., & Urtasun, R. (2012, June). Are we ready for autonomous driving? the kitti vision benchmark suite. In 2012 IEEE conference on computer vision and pattern recognition (pp. 3354-3361). IEEE.
[6] Armeni, I., Sener, O., Zamir, A. R., Jiang, H., Brilakis, I., Fischer, M., & Savarese, S. (2016). 3d semantic parsing of large-scale indoor spaces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1534-1543).
[7] Dai, A., Chang, A. X., Savva, M., Halber, M., Funkhouser, T., & Nießner, M. (2017). Scannet: Richly-annotated 3d reconstructions of indoor scenes. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 5828-5839).

Updates

9.12.2021 Initial Code release

Licence

3DGenZ is released under the Apache 2.0 license.

The folder 3DGenZ/genz3d/kpconv includes large parts of code taken from KP-Conv and is therefore distributed under the MIT Licence. See the LICENSE for this folder.

The folder 3DGenZ/genz3d/seg/utils also includes files taken from https://github.com/jfzhang95/pytorch-deeplab-xception and is therefore also distributed under the MIT License. See the LICENSE for these files.

Owner
valeo.ai
We are an international team based in Paris, conducting AI research for Valeo automotive applications, in collaboration with world-class academics.
valeo.ai
基于深度强化学习的原神自动钓鱼AI

原神自动钓鱼AI由YOLOX, DQN两部分模型组成。使用迁移学习,半监督学习进行训练。 模型也包含一些使用opencv等传统数字图像处理方法实现的不可学习部分。

4.2k Jan 01, 2023
A simple but complete full-attention transformer with a set of promising experimental features from various papers

x-transformers A concise but fully-featured transformer, complete with a set of promising experimental features from various papers. Install $ pip ins

Phil Wang 2.3k Jan 03, 2023
Official PyTorch implementation of the paper "Recycling Discriminator: Towards Opinion-Unaware Image Quality Assessment Using Wasserstein GAN", accepted to ACM MM 2021 BNI Track.

RecycleD Official PyTorch implementation of the paper "Recycling Discriminator: Towards Opinion-Unaware Image Quality Assessment Using Wasserstein GAN

Yunan Zhu 23 Nov 05, 2022
Malware Bypass Research using Reinforcement Learning

Malware Bypass Research using Reinforcement Learning

Bobby Filar 76 Dec 26, 2022
PSPNet in Chainer

PSPNet This is an unofficial implementation of Pyramid Scene Parsing Network (PSPNet) in Chainer. Training Requirement Python 3.4.4+ Chainer 3.0.0b1+

Shunta Saito 76 Dec 12, 2022
Boosted CVaR Classification (NeurIPS 2021)

Boosted CVaR Classification Runtian Zhai, Chen Dan, Arun Sai Suggala, Zico Kolter, Pradeep Ravikumar NeurIPS 2021 Table of Contents Quick Start Train

Runtian Zhai 4 Feb 15, 2022
Official implementation of the paper 'High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network' in CVPR 2021

LPTN Paper | Supplementary Material | Poster High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network Ji

372 Dec 26, 2022
[NeurIPS-2021] Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data

MosaicKD Code for NeurIPS-21 paper "Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data" 1. Motivation Natural images share common l

ZJU-VIPA 37 Nov 10, 2022
Autonomous Ground Vehicle Navigation and Control Simulation Examples in Python

Autonomous Ground Vehicle Navigation and Control Simulation Examples in Python THIS PROJECT IS CURRENTLY A WORK IN PROGRESS AND THUS THIS REPOSITORY I

Joshua Marshall 14 Dec 31, 2022
Semi Supervised Learning for Medical Image Segmentation, a collection of literature reviews and code implementations.

Semi-supervised-learning-for-medical-image-segmentation. Recently, semi-supervised image segmentation has become a hot topic in medical image computin

Healthcare Intelligence Laboratory 1.3k Jan 03, 2023
PyTorch implementation of DirectCLR from paper Understanding Dimensional Collapse in Contrastive Self-supervised Learning

DirectCLR DirectCLR is a simple contrastive learning model for visual representation learning. It does not require a trainable projector as SimCLR. It

Meta Research 49 Dec 21, 2022
Lepard: Learning Partial point cloud matching in Rigid and Deformable scenes

Lepard: Learning Partial point cloud matching in Rigid and Deformable scenes [Paper] Method overview 4DMatch Benchmark 4DMatch is a benchmark for matc

103 Jan 06, 2023
A geometric deep learning pipeline for predicting protein interface contacts.

A geometric deep learning pipeline for predicting protein interface contacts.

44 Dec 30, 2022
Practical Single-Image Super-Resolution Using Look-Up Table

Practical Single-Image Super-Resolution Using Look-Up Table [Paper] Dependency Python 3.6 PyTorch glob numpy pillow tqdm tensorboardx 1. Training deep

Younghyun Jo 116 Dec 23, 2022
Python Auto-ML Package for Tabular Datasets

Tabular-AutoML AutoML Package for tabular datasets Tabular dataset tuning is now hassle free! Run one liner command and get best tuning and processed

Sagnik Roy 18 Nov 20, 2022
Final report with code for KAIST Course KSE 801.

Orthogonal collocation is a method for the numerical solution of partial differential equations

Chuanbo HUA 4 Apr 06, 2022
Tutorials and implementations for "Self-normalizing networks"

Self-Normalizing Networks Tutorials and implementations for "Self-normalizing networks"(SNNs) as suggested by Klambauer et al. (arXiv pre-print). Vers

Institute of Bioinformatics, Johannes Kepler University Linz 1.6k Jan 07, 2023
Examples of using f2py to get high-speed Fortran integrated with Python easily

f2py Examples Simple examples of using f2py to get high-speed Fortran integrated with Python easily. These examples are also useful to troubleshoot pr

Michael 35 Aug 21, 2022
Implement the Pareto Optimizer and pcgrad to make a self-adaptive loss for multi-task

multi-task_losses_optimizer Implement the Pareto Optimizer and pcgrad to make a self-adaptive loss for multi-task 已经实验过了,不会有cuda out of memory情况 ##Par

14 Dec 25, 2022
Revealing and Protecting Labels in Distributed Training

Revealing and Protecting Labels in Distributed Training

Google Interns 0 Nov 09, 2022