Code for CVPR 2022 paper "SoftGroup for Instance Segmentation on 3D Point Clouds"

Overview

SoftGroup

PWC PWC Architecture

We provide code for reproducing results of the paper SoftGroup for 3D Instance Segmentation on Point Clouds (CVPR 2022)

Author: Thang Vu, Kookhoi Kim, Tung M. Luu, Xuan Thanh Nguyen, and Chang D. Yoo.

Introduction

Existing state-of-the-art 3D instance segmentation methods perform semantic segmentation followed by grouping. The hard predictions are made when performing semantic segmentation such that each point is associated with a single class. However, the errors stemming from hard decision propagate into grouping that results in (1) low overlaps between the predicted instance with the ground truth and (2) substantial false positives. To address the aforementioned problems, this paper proposes a 3D instance segmentation method referred to as SoftGroup by performing bottom-up soft grouping followed by top-down refinement. SoftGroup allows each point to be associated with multiple classes to mitigate the problems stemming from semantic prediction errors and suppresses false positive instances by learning to categorize them as background. Experimental results on different datasets and multiple evaluation metrics demonstrate the efficacy of SoftGroup. Its performance surpasses the strongest prior method by a significant margin of +6.2% on the ScanNet v2 hidden test set and +6.8% on S3DIS Area 5 of AP_50.

Learderboard

Feature

  • State of the art performance on the ScanNet benchmark and S3DIS dataset (3/Mar/2022).
  • High speed of 345 ms per scan on ScanNet dataset, which is comparable with the existing fastest methods (HAIS).
  • Reproducibility code for both ScanNet and S3DIS datasets.

Installation

Please refer to installation guide.

Data Preparation

Please refer to data preparation for preparing the S3DIS and ScanNet v2 dataset.

Pretrained models

Dataset AP AP_50 AP_25 Download
S3DIS 51.4 66.5 75.4 model
ScanNet v2 46.0 67.6 78.9 model

Training

We use the checkpoint of HAIS as pretrained backbone. Download the pretrained HAIS model at here at put it in SoftGroup/ directory.

Training S3DIS dataset

First, finetune the pretrained HAIS point-wise prediction network (backbone) on S3DIS.

python train.py --config config/softgroup_fold5_backbone_s3dis.yaml

Then, train model from frozen backbone.

python train.py --config config/softgroup_fold5_default_s3dis.yaml

Training ScanNet V2 dataset

Training on ScanNet doesnot require finetuning the backbone. Just freeze pretrained backbone and train the model.

python train.py --config config/softgroup_default_scannet.yaml

Inference

Testing for S3DIS dataset.

CUDA_VISIBLE_DEVICES=0 python test_s3dis.py --config config/softgroup_fold5_phase2_s3dis.yaml --pretrain $PATH_TO_PRETRAIN_MODEL$

Testing for ScanNet V2 dataset.

CUDA_VISIBLE_DEVICES=0 python test.py --config config/softgroup_default_scannet.yaml --pretrain $PATH_TO_PRETRAIN_MODEL$

Visualization

We provide visualization tools based on Open3D (tested on Open3D 0.8.0).

pip install open3D==0.8.0
python visualize_open3d.py --data_path {} --prediction_path {} --data_split {} --room_name {} --task {}

Please refer to visualize_open3d.py for more details.

Citation

If you find our work helpful for your research. Please consider citing our paper.

@inproceedings{vu2022softgroup,
  title={SoftGroup for 3D Instance Segmentation on 3D Point Clouds},
  author={Vu, Thang and Kim, Kookhoi and Luu, Tung M. and Nguyen, Xuan Thanh and Yoo, Chang D.},
  booktitle={CVPR},
  year={2022}
}
Owner
Thang Vu
My research involves in Deep Learning for Computer Vision (image enhancement, object detection, segmentation) and other AI related fields.
Thang Vu
基于openpose和图像分类的手语识别项目

手语识别 0、使用到的模型 (1). openpose,作者:CMU-Perceptual-Computing-Lab https://github.com/CMU-Perceptual-Computing-Lab/openpose (2). 图像分类classification,作者:Bubbl

20 Dec 15, 2022
An unofficial implementation of the paper "AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss".

AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss This is an unofficial implementation of AutoVC based on the official one. The reposi

Chien-yu Huang 27 Jun 16, 2022
【Auto】原神⭐钓鱼辅助工具 | 自动收竿、校准游标 | ✨您只需要抛出鱼竿,我们会帮你完成一切✨

原神钓鱼辅助工具 ✨ 作者正在努力重构代码中……会尽快带给大家一个更完美的脚本 ✨ 「您只需抛出鱼竿,然后我们会帮您搞定一切」 如果你觉得这个脚本好用,请点一个 Star ⭐ ,你的 Star 就是作者更新最大的动力 点击这里 查看演示视频 ✨ 欢迎大家在 Issues 中分享自己的配置文件 ✨ ✨

261 Jan 02, 2023
Image processing using OpenCv

Image processing using OpenCv Write a program that opens the webcam, and the user selects one of the following on the video: ✅ If the user presses the

M.Najafi 4 Feb 18, 2022
Tesseract Open Source OCR Engine (main repository)

Tesseract OCR About This package contains an OCR engine - libtesseract and a command line program - tesseract. Tesseract 4 adds a new neural net (LSTM

48.4k Jan 09, 2023
🔎 Like Chardet. 🚀 Package for encoding & language detection. Charset detection.

Charset Detection, for Everyone 👋 The Real First Universal Charset Detector A library that helps you read text from an unknown charset encoding. Moti

TAHRI Ahmed R. 332 Dec 31, 2022
Image processing in Python

scikit-image: Image processing in Python Website (including documentation): https://scikit-image.org/ Mailing list: https://mail.python.org/mailman3/l

Image Processing Toolbox for SciPy 5.2k Dec 30, 2022
LEARN OPENCV IN 3 HOURS USING PYTHON - INCLUDING EXAMPLE PROJECTS

LEARN OPENCV IN 3 HOURS USING PYTHON - INCLUDING EXAMPLE PROJECTS

Murtaza Hassan 815 Dec 29, 2022
Python package for handwriting and sketching in Jupyter cells

ipysketch A Python package for handwriting and sketching in Jupyter notebooks. Usage A movie is worth a thousand pictures is worth a million words...

Matthias Baer 16 Jan 05, 2023
Text recognition (optical character recognition) with deep learning methods.

What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis | paper | training and evaluation data | failure cases and cle

Clova AI Research 3.2k Jan 04, 2023
An Implementation of the seglink alogrithm in paper Detecting Oriented Text in Natural Images by Linking Segments

Tips: A more recent scene text detection algorithm: PixelLink, has been implemented here: https://github.com/ZJULearning/pixel_link Contents: Introduc

dengdan 484 Dec 07, 2022
SRA's seminar on Introduction to Computer Vision Fundamentals

Introduction to Computer Vision This repository includes basics to : Python Numpy: A python library Git Computer Vision. The aim of this repository is

Society of Robotics and Automation 147 Dec 04, 2022
Generating .npy dataset and labels out of given image, containing numbers from 0 to 9, using opencv

basic-dataset-generator-from-image-of-numbers generating .npy dataset and labels out of given image, containing numbers from 0 to 9, using opencv inpu

1 Jan 01, 2022
Volume Control using OpenCV

Gesture-Volume-Control Volume Control using OpenCV Here i made volume control using Python and OpenCV in which we can control the volume of our laptop

Mudit Sinha 3 Oct 10, 2021
Image Smoothing and Blurring Using OpenCV

Image-Smoothing-and-Blurring-Using-OpenCV This repository contains codes for performing image smoothing and blurring using OpenCV. There are different

Happy N. Monday 3 Feb 15, 2022
The official code for the ICCV-2021 paper "Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates".

SpeechDrivesTemplates The official repo for the ICCV-2021 paper "Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates". [arxiv

Qian Shenhan 53 Dec 23, 2022
Text language identification using Wikipedia data

Text language identification using Wikipedia data The aim of this project is to provide high-quality language detection over all the web's languages.

Vsevolod Dyomkin 28 Jul 09, 2022
Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd.

Head Detector Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd. The head_detection mod

Ramana Subramanyam 76 Dec 06, 2022
This repository contains the code for the paper "SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks"

SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks (CVPR 2021 Oral) This repository contains the official PyTorch implementation

Shunsuke Saito 235 Dec 18, 2022
This is an API written in python that uses FastAPI. It is a simple API that can detect discord tokens in Images.

Welcome This is an API written in python that uses FastAPI. It is a simple API that can detect discord tokens in Images. Installation There are curren

8 Jul 29, 2022