EgoNN: Egocentric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale

Overview

EgonNN: Egocentric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale

Paper: EgoNN: Egocentric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale submitted to IEEE Robotics and Automation Letters (RA-L) (ArXiv)

Jacek Komorowski, Monika Wysoczanska, Tomasz Trzcinski

Warsaw University of Technology

What's new

  • [2021-10-24] Evaluation code and pretrained models released.

Our other projects

  • MinkLoc3D: Point Cloud Based Large-Scale Place Recognition (WACV 2021): MinkLoc3D
  • MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition (IJCNN 2021): MinkLoc++
  • Large-Scale Topological Radar Localization Using Learned Descriptors (ICONIP 2021): RadarLoc

Introduction

The paper presents a deep neural network-based method for global and local descriptors extraction from a point cloud acquired by a rotating 3D LiDAR sensor. The descriptors can be used for two-stage 6DoF relocalization. First, a course position is retrieved by finding candidates with the closest global descriptor in the database of geo-tagged point clouds. Then, 6DoF pose between a query point cloud and a database point cloud is estimated by matching local descriptors and using a robust estimator such as RANSAC. Our method has a simple, fully convolutional architecture and uses a sparse voxelized representation of the input point cloud. It can efficiently extract a global descriptor and a set of keypoints with their local descriptors from large point clouds with tens of thousand points.

Citation

If you find this work useful, please consider citing:

Environment and Dependencies

Code was tested using Python 3.8 with PyTorch 1.9.1 and MinkowskiEngine 0.5.4 on Ubuntu 20.04 with CUDA 10.2. Note: CUDA 11.1 is not recommended as there are some issues with MinkowskiEngine 0.5.4 on CUDA 11.1.

The following Python packages are required:

  • PyTorch (version 1.9.1)
  • MinkowskiEngine (version 0.5.4)
  • pytorch_metric_learning (version 0.9.99 or above)
  • wandb

Modify the PYTHONPATH environment variable to include absolute path to the project root folder:

export PYTHONPATH=$PYTHONPATH:/home/.../Egonn

Datasets

EgoNN is trained and evaluated using the following datasets:

  • MulRan dataset: Sejong traversal is used. The traversal is split into training and evaluation part link
  • Apollo-SouthBay dataset: SunnyvaleBigLoop trajectory is used for evaluation, other 5 trajectories (BaylandsToSeafood, ColumbiaPark, Highway237, MathildaAVE, SanJoseDowntown) are used for training link
  • Kitti dataset: Sequence 00 is used for evaluation link

First, you need to download datasets:

  • For MulRan dataset you need to download ground truth data (*.csv) and LiDAR point clouds (Ouster.zip) for traversals: Sejong01 and Sejong02 (link).
  • Download Apollo-SouthBay dataset using the download link on the dataset website (link).
  • Download Kitti odometry dataset (calibration files, ground truth poses, Velodyne laser data) (link).

After loading datasets you need to generate training pickles for the network training and evaluation pickles for model evaluation.

Training pickles generation

Generating training tuples is very time consuming, as ICP is used to refine the ground truth poses between each pair of neighbourhood point clouds.

cd datasets/mulran
python generate_training_tuples.py --dataset_root <mulran_dataset_root_path>

cd ../southbay
python generate_training_tuples.py --dataset_root <apollo_southbay_dataset_root_path>
Evaluation pickles generation
cd datasets/mulran
python generate_evaluation_sets.py --dataset_root <mulran_dataset_root_path>

cd ../southbay
python generate_evaluation_sets.py --dataset_root <apollo_southbay_dataset_root_path>

cd ../kitti
python generate_evaluation_sets.py --dataset_root <kitti_dataset_root_path>

Training (training code will be released after the paper acceptance)

First, download datasets and generate training and evaluation pickles as described above. Edit the configuration file config_egonn.txt. Set dataset_folder parameter to point to the dataset root folder. Modify batch_size_limit and secondary_batch_size_limit parameters depending on available GPU memory. Default limits requires at least 11GB of GPU RAM.

To train the EgoNN model, run:

cd training

python train.py --config ../config/config_egonn.txt --model_config ../models/egonn.txt 

Pre-trained Model

EgoNN model trained (on training splits of MulRan and Apollo-SouthBay datasets) is available in weights/model_egonn_20210916_1104.pth folder.

Evaluation

To evaluate a pretrained model run below commands. Ground truth poses between different traversals in all three datasets are slightly misaligned. To reproduce results from the paper, use --icp_refine option to refine ground truth poses using ICP.

cd eval

# To evaluate on test split of Mulran dataset
python evaluate.py --dataset_root <dataset_root_path> --dataset_type mulran --eval_set test_Sejong01_Sejong02.pickle --model_config ../models/egonn.txt --weights ../weights/model_egonn_20210916_1104.pth --icp_refine

# To evaluate on test split of Apollo-SouthBay dataset
python evaluate.py --dataset_root <dataset_root_path> --dataset_type southbay --eval_set test_SunnyvaleBigloop_1.0_5.pickle --model_config ../models/egonn.txt --weights ../weights/model_egonn_20210916_1104.pth --icp_refine

# To evaluate on test split of KITTI dataset
python evaluate.py --dataset_root <dataset_root_path> --dataset_type kitti --eval_set kitti_00_eval.pickle --model_config ../models/egonn.txt --weights ../weights/model_egonn_20210916_1104.pth --icp_refine

Results

EgoNN performance...

Visualizations

Visualizations of our keypoint detector results. On the left, we show 128 keypoints with the lowest saliency uncertainty (red dots). On the right, 128 keypoints with the highest uncertainty (yellow dots).

Successful registration of point cloud pairs from KITTI dataset gathered during revisiting the same place from different directions. On the left we show keypoint correspondences (RANSAC inliers) found during 6DoF pose estimation with RANSAC. On the right we show point clouds aligned using estimated poses.

License

Our code is released under the MIT License (see LICENSE file for details).

The source code of CVPR17 'Generative Face Completion'.

GenerativeFaceCompletion Matcaffe implementation of our CVPR17 paper on face completion. In each panel from left to right: original face, masked input

Yijun Li 313 Oct 18, 2022
A simple baseline for 3d human pose estimation in tensorflow. Presented at ICCV 17.

3d-pose-baseline This is the code for the paper Julieta Martinez, Rayat Hossain, Javier Romero, James J. Little. A simple yet effective baseline for 3

Julieta Martinez 1.3k Jan 03, 2023
Enhancing Column Generation by a Machine-Learning-BasedPricing Heuristic for Graph Coloring

Enhancing Column Generation by a Machine-Learning-BasedPricing Heuristic for Graph Coloring (to appear at AAAI 2022) We propose a machine-learning-bas

YunzhuangS 2 May 02, 2022
Code for C2-Matching (CVPR2021). Paper: Robust Reference-based Super-Resolution via C2-Matching.

C2-Matching (CVPR2021) This repository contains the implementation of the following paper: Robust Reference-based Super-Resolution via C2-Matching Yum

Yuming Jiang 151 Dec 26, 2022
A repository with exploration into using transformers to predict DNA ↔ transcription factor binding

Transcription Factor binding predictions with Attention and Transformers A repository with exploration into using transformers to predict DNA ↔ transc

Phil Wang 62 Dec 20, 2022
Official implementation of UTNet: A Hybrid Transformer Architecture for Medical Image Segmentation

UTNet (Accepted at MICCAI 2021) Official implementation of UTNet: A Hybrid Transformer Architecture for Medical Image Segmentation Introduction Transf

110 Jan 01, 2023
The official implementation of the CVPR2021 paper: Decoupled Dynamic Filter Networks

Decoupled Dynamic Filter Networks This repo is the official implementation of CVPR2021 paper: "Decoupled Dynamic Filter Networks". Introduction DDF is

F.S.Fire 180 Dec 30, 2022
This is a package for LiDARTag, described in paper: LiDARTag: A Real-Time Fiducial Tag System for Point Clouds

LiDARTag Overview This is a package for LiDARTag, described in paper: LiDARTag: A Real-Time Fiducial Tag System for Point Clouds (PDF)(arXiv). This wo

University of Michigan Dynamic Legged Locomotion Robotics Lab 159 Dec 21, 2022
CodeContests is a competitive programming dataset for machine-learning

CodeContests CodeContests is a competitive programming dataset for machine-learning. This dataset was used when training AlphaCode. It consists of pro

DeepMind 1.6k Jan 08, 2023
ByteTrack超详细教程!训练自己的数据集&&摄像头实时检测跟踪

ByteTrack超详细教程!训练自己的数据集&&摄像头实时检测跟踪

Double-zh 45 Dec 19, 2022
This code provides various models combining dilated convolutions with residual networks

Overview This code provides various models combining dilated convolutions with residual networks. Our models can achieve better performance with less

Fisher Yu 1.1k Dec 30, 2022
Very Deep Convolutional Networks for Large-Scale Image Recognition

pytorch-vgg Some scripts to convert the VGG-16 and VGG-19 models [1] from Caffe to PyTorch. The converted models can be used with the PyTorch model zo

Justin Johnson 217 Dec 05, 2022
offical implement of our Lifelong Person Re-Identification via Adaptive Knowledge Accumulation in CVPR2021

LifelongReID Offical implementation of our Lifelong Person Re-Identification via Adaptive Knowledge Accumulation in CVPR2021 by Nan Pu, Wei Chen, Yu L

PeterPu 76 Dec 08, 2022
Keras-retinanet - Keras implementation of RetinaNet object detection.

Keras RetinaNet Keras implementation of RetinaNet object detection as described in Focal Loss for Dense Object Detection by Tsung-Yi Lin, Priya Goyal,

Fizyr 4.3k Jan 01, 2023
Anonymize BLM Protest Images

Anonymize BLM Protest Images This repository automates @BLMPrivacyBot, a Twitter bot that shows the anonymized images to help keep protesters safe. Us

Stanford Machine Learning Group 40 Oct 13, 2022
Image inpainting using Gaussian Mixture Models

dmfa_inpainting Source code for: MisConv: Convolutional Neural Networks for Missing Data (to be published at WACV 2022) Estimating conditional density

Marcin Przewięźlikowski 8 Oct 09, 2022
DeepCO3: Deep Instance Co-segmentation by Co-peak Search and Co-saliency

[CVPR19] DeepCO3: Deep Instance Co-segmentation by Co-peak Search and Co-saliency (Oral paper) Authors: Kuang-Jui Hsu, Yen-Yu Lin, Yung-Yu Chuang PDF:

Kuang-Jui Hsu 139 Dec 22, 2022
A user-friendly research and development tool built to standardize RL competency assessment for custom agents and environments.

Built with ❤️ by Sam Showalter Contents Overview Installation Dependencies Usage Scripts Standard Execution Environment Development Environment Benchm

SRI-AIC 1 Nov 18, 2021
Official Code Release for "CLIP-Adapter: Better Vision-Language Models with Feature Adapters"

Official Code Release for "CLIP-Adapter: Better Vision-Language Models with Feature Adapters" Pipeline of CLIP-Adapter CLIP-Adapter is a drop-in modul

peng gao 157 Dec 26, 2022
This project uses reinforcement learning on stock market and agent tries to learn trading. The goal is to check if the agent can learn to read tape. The project is dedicated to hero in life great Jesse Livermore.

Reinforcement-trading This project uses Reinforcement learning on stock market and agent tries to learn trading. The goal is to check if the agent can

Deepender Singla 1.4k Dec 22, 2022