Perception-aware multi-sensor fusion for 3D LiDAR semantic segmentation (ICCV 2021)

Related tags

Deep LearningPMF
Overview

Perception-Aware Multi-Sensor Fusion for 3D LiDAR Semantic Segmentation (ICCV 2021)

[中文|EN]

概述

本工作主要探索一种高效的多传感器(激光雷达和摄像头)融合点云语义分割方法。现有的多传感器融合方法主要将点云投影到图像上,获取对应的像素位置之后,将对应位置的图像信息投影回点云空间进行特征融合。但是,这种方式下并不能很好的利用图像丰富的视觉感知特征(例如形状、纹理等)。因此,我们尝试探索一种在RGB图像空间进行特征融合的方式,提出了一个基于视觉感知的多传感器融合方法(PMF)。详细内容可以查看我们的公开论文。

image-20211013141408045

主要实验结果

PWC

Leader board of [email protected]

image-20211013144333265

更多实验结果

我们在持续探索PMF框架的潜力,包括探索更大的模型、更好的ImageNet预训练模型、其他的数据集等。我们的实验结果证明了,PMF框架是易于拓展的,并且其性能可以通过使用更好的主干网络而实现提升。详细的说明可以查看文件

方法 数据集 mIoU (%)
PMF-ResNet34 SemanticKITTI Validation Set 63.9
PMF-ResNet34 nuScenes Validation Set 76.9
PMF-ResNet50 nuScenes Validation Set 79.4
PMF48-ResNet101 SensatUrban Test Set (ICCV2021 Competition) 66.2 (排名 5)

使用说明

注:代码中涉及到包括数据集在内的各种路径配置,请根据自己的实际路径进行修改

代码结构

|--- pc_processor/ 点云处理的Python包
	|--- checkpoint/ 生成实验结果目录
	|--- dataset/ 数据集处理
	|--- layers/ 常用网络层
	|--- loss/ 损失函数
	|--- metrices/ 模型性能指标函数
	|--- models/ 网络模型
	|--- postproc/ 后处理,主要是KNN
	|--- utils/ 其他函数
|--- tasks/ 实验任务
	|--- pmf/ PMF 训练源代码
	|--- pmf_eval_nuscenes/ PMF 模型在nuScenes评估代码
		|--- testset_eval/ 合并PMF以及salsanext结果并在nuScenes测试集上评估
		|--- xxx.py PMF 模型在nuScenes评估代码
	|--- pmf_eval_semantickitti/ PMF 在SemanticKITTI valset上评估代码
	|--- salsanext/ SalsaNext 训练代码,基于官方公开代码进行修改
	|--- salsanext_eval_nuscenes/ SalsaNext 在nuScenes 数据集上评估代码

模型训练

训练任务代码目录结构

|--- pmf/
	|--- config_server_kitti.yaml SemanticKITTI数据集训练的配置脚本
	|--- config_server_nus.yaml nuScenes数据集训练的配置脚本
	|--- main.py 主函数
	|--- trainer.py 训练代码
	|--- option.py 配置解析代码
	|--- run.sh 执行脚本,需要 chmod+x 赋予可执行权限

步骤

  1. 进入 tasks/pmf目录,修改配置文件 config_server_kitti.yaml中数据集路径 data_root 为实际数据集路径。如果有需要可以修改gpubatch_size等参数
  2. 修改 run.sh 确保 nproc_per_node 的数值与yaml文件中配置的gpu数量一致
  3. 运行如下指令执行训练脚本
./run.sh
# 或者 bash run.sh
  1. 执行成功之后会在 PMF/experiments/PMF-SemanticKitti路径下自动生成实验日志文件,目录结构如下:
|--- log_dataset_network_xxxx/
	|--- checkpoint/ 训练断点文件以及最佳模型参数
	|--- code/ 代码备份
	|--- log/ 控制台输出日志以及配置文件副本
	|--- events.out.tfevents.xxx tensorboard文件

控制台输出内容如下,其中最后的输出时间为实验预估时间

image-20211013152939956

模型推理

模型推理代码目录结构

|--- pmf_eval_semantickitti/ SemanticKITTI评估代码
	|--- config_server_kitti.yaml 配置脚本
	|--- infer.py 推理脚本
	|--- option.py 配置解析脚本

步骤

  1. 进入 tasks/pmf_eval_semantickitti目录,修改配置文件 config_server_kitti.yaml中数据集路径 data_root 为实际数据集路径。修改pretrained_path指向训练生成的日志文件夹目录。
  2. 运行如下命令执行脚本
python infer.py config_server_kitti.yaml
  1. 运行成功之后,会在训练模型所在目录下生成评估结果日志文件,文件夹目录结构如下:
|--- PMF/experiments/PMF-SemanticKitti/log_xxxx/ 训练结果路径
	|--- Eval_xxxxx/ 评估结果路径
		|--- code/ 代码备份
		|--- log/ 控制台日志文件
		|--- pred/ 用于提交评估的文件

引用

@InProceedings{Zhuang_2021_ICCV,
    author    = {Zhuang, Zhuangwei and Li, Rong and Jia, Kui and Wang, Qicheng and Li, Yuanqing and Tan, Mingkui},
    title     = {Perception-Aware Multi-Sensor Fusion for 3D LiDAR Semantic Segmentation},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {16280-16290}
}
Owner
ICE
Model compression; Object detection; Point cloud processing;
ICE
Repo for CVPR2021 paper "QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information"

QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information by Masato Tamura, Hiroki Ohashi, and Tomoaki Yosh

105 Dec 23, 2022
Hitters Linear Regression - Hitters Linear Regression With Python

Hitters_Linear_Regression Kullanacağımız veri seti Carnegie Mellon Üniversitesi'

AyseBuyukcelik 2 Jan 26, 2022
Range Image-based LiDAR Localization for Autonomous Vehicles Using Mesh Maps

Range Image-based 3D LiDAR Localization This repo contains the code for our ICRA2021 paper: Range Image-based LiDAR Localization for Autonomous Vehicl

Photogrammetry & Robotics Bonn 208 Dec 15, 2022
Creating Multi Task Models With Keras

Creating Multi Task Models With Keras About The Project! I used the keras and Tensorflow Library, To build a Deep Learning Neural Network to Creating

Srajan Chourasia 4 Nov 28, 2022
Data labels and scripts for fastMRI.org

fastMRI+: Clinical pathology annotations for the fastMRI dataset The fastMRI dataset is a publicly available MRI raw (k-space) dataset. It has been us

Microsoft 51 Dec 22, 2022
competitions-v2

Codabench (formerly Codalab Competitions v2) Installation $ cp .env_sample .env $ docker-compose up -d $ docker-compose exec django ./manage.py migrat

CodaLab 21 Dec 02, 2022
商品推荐系统

商品top50推荐系统 问题建模 本项目的数据集给出了15万左右的用户以及12万左右的商品, 以及对应的经过脱敏处理的用户特征和经过预处理的商品特征,旨在为用户推荐50个其可能购买的商品。 推荐系统架构方案 本项目采用传统的召回+排序的方案。

107 Dec 29, 2022
Official PyTorch implementation of UACANet: Uncertainty Aware Context Attention for Polyp Segmentation

UACANet: Uncertainty Aware Context Attention for Polyp Segmentation Official pytorch implementation of UACANet: Uncertainty Aware Context Attention fo

Taehun Kim 85 Dec 14, 2022
Retinal Vessel Segmentation with Pixel-wise Adaptive Filters (ISBI 2022)

Retinal Vessel Segmentation with Pixel-wise Adaptive Filters (ISBI 2022) Introdu

anonymous 14 Oct 27, 2022
Music Source Separation; Train & Eval & Inference piplines and pretrained models we used for 2021 ISMIR MDX Challenge.

Music Source Separation with Channel-wise Subband Phase Aware ResUnet (CWS-PResUNet) Introduction This repo contains the pretrained Music Source Separ

Lau 100 Dec 25, 2022
docTR by Mindee (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

docTR by Mindee (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

Mindee 1.5k Jan 01, 2023
Optimal Camera Position for a Practical Application of Gaze Estimation on Edge Devices,

Optimal Camera Position for a Practical Application of Gaze Estimation on Edge Devices, Linh Van Ma, Tin Trung Tran, Moongu Jeon, ICAIIC 2022 (The 4th

Linh 11 Oct 10, 2022
Official code of paper "PGT: A Progressive Method for Training Models on Long Videos" on CVPR2021

PGT Code for paper PGT: A Progressive Method for Training Models on Long Videos. Install Run pip install -r requirements.txt. Run python setup.py buil

Bo Pang 27 Mar 30, 2022
Pytorch implementation of Supporting Clustering with Contrastive Learning, NAACL 2021

Supporting Clustering with Contrastive Learning SCCL (NAACL 2021) Dejiao Zhang, Feng Nan, Xiaokai Wei, Shangwen Li, Henghui Zhu, Kathleen McKeown, Ram

231 Jan 05, 2023
A neuroanatomy-based augmented reality experience powered by computer vision. Features 3D visuals of the Atlas Brain Map slices.

Brain Augmented Reality (AR) A neuroanatomy-based augmented reality experience powered by computer vision that features 3D visuals of the Atlas Brain

Yasmeen Brain 10 Oct 06, 2022
Hydra: an Extensible Fuzzing Framework for Finding Semantic Bugs in File Systems

Hydra: An Extensible Fuzzing Framework for Finding Semantic Bugs in File Systems Paper Finding Semantic Bugs in File Systems with an Extensible Fuzzin

gts3.org (<a href=[email protected])"> 129 Dec 15, 2022
a reimplementation of Optical Flow Estimation using a Spatial Pyramid Network in PyTorch

pytorch-spynet This is a personal reimplementation of SPyNet [1] using PyTorch. Should you be making use of this work, please cite the paper according

Simon Niklaus 269 Jan 02, 2023
Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"

Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Ro

Meta Research 1.2k Jan 02, 2023
2021 CCF BDCI 全国信息检索挑战杯(CCIR-Cup)智能人机交互自然语言理解赛道第二名参赛解决方案

2021 CCF BDCI 全国信息检索挑战杯(CCIR-Cup) 智能人机交互自然语言理解赛道第二名解决方案 比赛网址: CCIR-Cup-智能人机交互自然语言理解 1.依赖环境: python==3.8 torch==1.7.1+cu110 numpy==1.19.2 transformers=

JinXiang 22 Oct 29, 2022
Generating Anime Images by Implementing Deep Convolutional Generative Adversarial Networks paper

AnimeGAN - Deep Convolutional Generative Adverserial Network PyTorch implementation of DCGAN introduced in the paper: Unsupervised Representation Lear

Rohit Kukreja 23 Jul 21, 2022