SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement

Overview

SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement

This repository implements the approach described in SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement (WACV 2022).

Iterative refinement using SporeAgent

Iterative registration using SporeAgent:
The initial pose from PoseCNN (purple) and the final pose using SporeAgent (blue) on the LINEMOD (left,cropped) and YCB-Video (right) datasets.

Scene-level Plausibility

Scene-level Plausibility:
The initial scene configuration from PoseCNN (left) results in an implausible pose of the target object (gray). Refinement using SporeAgent (right) results in a plausible scene configuration where the intersecting points (red) are resolved and the object rests on its supported points (cyan).

LINEMOD AD < 0.10d AD < 0.05d AD <0.02d YCB-Video ADD AUC AD AUC ADI AUC
PoseCNN 62.7 26.9 3.3 51.5 61.3 75.2
Point-to-Plane ICP 92.6 79.8 29.9 68.2 79.2 87.8
w/ VeREFINE 96.1 85.8 32.5 70.1 81.0 88.8
Multi-hypothesis ICP 99.3 89.9 35.6 77.4 86.6 92.6
SporeAgent 99.3 93.7 50.3 79.0 88.8 93.6

Comparison on LINEMOD and YCB-Video:
The initial pose and segmentation estimates are computed using PoseCNN. We compare our approach to vanilla Point-to-Plane ICP (from Open3D), Point-to-Plane ICP augmented by the simulation-based VeREFINE approach and the ICP-based multi-hypothesis approach used for refinement in PoseCNN.

Dependencies

The code has been tested on Ubuntu 16.04 and 20.04 with Python 3.6 and CUDA 10.2. To set-up the Python environment, use Anaconda and the provided YAML file:

conda env create -f environment.yml --name sporeagent

conda activate sporeagent.

The BOP Toolkit is additionally required. The BOP_PATH in config.py needs to be changed to the respective clone directory and the packages required by the BOP Toolkit need to be installed.

The YCB-Video Toolbox is required for experiments on the YCB-Video dataset.

Datasets

We use the dataset versions prepared for the BOP challenge. The required files can be downloaded to a directory of your choice using the following bash script:

export SRC=http://ptak.felk.cvut.cz/6DB/public/bop_datasets
export DATASET=ycbv                     # either "lm" or "ycbv"
wget $SRC/$DATASET_base.zip             # Base archive with dataset info, camera parameters, etc.
wget $SRC/$DATASET_models.zip           # 3D object models.
wget $SRC/$DATASET_test_all.zip         # All test images.
unzip $DATASET_base.zip                 # Contains folder DATASET.
unzip $DATASET_models.zip -d $DATASET   # Unpacks to DATASET.
unzip $DATASET_test_all.zip -d $DATASET # Unpacks to DATASET.

For training on YCB-Video, the $DATASET_train_real.zip is moreover required.

In addition, we have prepared point clouds sampled within the ground-truth masks (for training) and the segmentation masks computed using PoseCNN (for evaluation) for the LINEMOD and YCB-Video dataset. The samples for evaluation also include the initial pose estimates from PoseCNN.

LINEMOD

Extract the prepared samples into PATH_TO_BOP_LM/sporeagent/ and set LM_PATH in config.py to the base directory, i.e., PATH_TO_BOP_LM. Download the PoseCNN results and the corresponding image set definitions provided with DeepIM and extract both into POSECNN_LM_RESULTS_PATH. Finally, since the BOP challenge uses a different train/test split than the compared methods, the appropriate target file found here needs to be placed in the PATH_TO_BOP_LM directory.

To compute the AD scores using the BOP Toolkit, BOP_PATH/scripts/eval_bop19.py needs to be adapted:

  • to use ADI for symmetric objects and ADD otherwise with a 2/5/10% threshold, change p['errors'] to
{
  'n_top': -1,
  'type': 'ad',
  'correct_th': [[0.02], [0.05], [0.1]]
}
  • to use the correct test targets, change p['targets_filename'] to 'test_targets_add.json'

YCB-Video

Extract the prepared samples into PATH_TO_BOP_YCBV/reagent/ and set YCBV_PATH in config.py to the base directory, i.e., PATH_TO_BOP_YCBV. Clone the YCB Video Toolbox to POSECNN_YCBV_RESULTS_PATH. Extract the results_PoseCNN_RSS2018.zip and copy test_data_list.txt to the same directory. The POSECNN_YCBV_RESULTS_PATH in config.py needs to be changed to the respective directory. Additionally, place the meshes in the canonical frame models_eval_canonical in the PATH_TO_BOP_YCBV directory.

To compute the ADD/AD/ADI AUC scores using the YCB-Video Toolbox, replace the respective files in the toolbox by the ones provided in sporeagent/ycbv_toolbox.

Pretrained models

Weights for both datasets can be found here. Download and copy them to sporeagent/weights/.

Training

For LINEMOD: python registration/train.py --dataset=lm

For YCB-Video: python registration/train.py --dataset=ycbv

Evaluation

Note that we precompute the normal images used for pose scoring on the first run and store them to disk.

LINEMOD

The results for LINEMOD are computed using the BOP Toolkit. The evaluation script exports the required file by running

python registration/eval.py --dataset=lm,

which can then be processed via

python BOP_PATH/scripts/eval_bop19.py --result_filenames=PATH_TO_CSV_WITH_RESULTS.

YCB-Video

The results for YCB-Video are computed using the YCB-Video Toolbox. The evaluation script exports the results in BOP format by running

python registration/eval.py --dataset=ycbv,

which can then be parsed into the format used by the YCB-Video Toolbox by running

python utility/parse_matlab.py.

In MATLAB, run evaluate_poses_keyframe.m to generate the per-sample results and plot_accuracy_keyframe.m to compute the statistics.

Citation

If you use this repository in your publications, please cite

@article{bauer2022sporeagent,
    title={SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement},
    author={Bauer, Dominik and Patten, Timothy and Vincze, Markus},
    booktitle={IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    year={2022},
    pages={654-662}
}
Owner
Dominik Bauer
Dominik Bauer
The 1st Place Solution of the Facebook AI Image Similarity Challenge (ISC21) : Descriptor Track.

ISC21-Descriptor-Track-1st The 1st Place Solution of the Facebook AI Image Similarity Challenge (ISC21) : Descriptor Track. You can check our solution

lyakaap 73 Dec 24, 2022
Faster Convex Lipschitz Regression

Faster Convex Lipschitz Regression This reepository provides a python implementation of our Faster Convex Lipschitz Regression algorithm with GPU and

Ali Siahkamari 0 Nov 19, 2021
PyTorch code for training MM-DistillNet for multimodal knowledge distillation

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge MM-DistillNet is a

51 Dec 20, 2022
Repository containing the PhD Thesis "Formal Verification of Deep Reinforcement Learning Agents"

Getting Started This repository contains the code used for the following publications: Probabilistic Guarantees for Safe Deep Reinforcement Learning (

Edoardo Bacci 5 Aug 31, 2022
NanoDet-Plus⚡Super fast and lightweight anchor-free object detection model. 🔥Only 980 KB(int8) / 1.8MB (fp16) and run 97FPS on cellphone🔥

NanoDet-Plus⚡Super fast and lightweight anchor-free object detection model. 🔥Only 980 KB(int8) / 1.8MB (fp16) and run 97FPS on cellphone🔥

4.8k Jan 07, 2023
ALFRED - A Benchmark for Interpreting Grounded Instructions for Everyday Tasks

ALFRED A Benchmark for Interpreting Grounded Instructions for Everyday Tasks Mohit Shridhar, Jesse Thomason, Daniel Gordon, Yonatan Bisk, Winson Han,

ALFRED 204 Dec 15, 2022
HDMapNet: A Local Semantic Map Learning and Evaluation Framework

HDMapNet_devkit Devkit for HDMapNet. HDMapNet: A Local Semantic Map Learning and Evaluation Framework Qi Li, Yue Wang, Yilun Wang, Hang Zhao [Paper] [

Tsinghua MARS Lab 421 Jan 04, 2023
Computing Shapley values using VAEAC

Shapley values and the VAEAC method In this GitHub repository, we present the implementation of the VAEAC approach from our paper "Using Shapley Value

3 Nov 23, 2022
Code for the ICCV2021 paper "Personalized Image Semantic Segmentation"

PSS: Personalized Image Semantic Segmentation Paper PSS: Personalized Image Semantic Segmentation Yu Zhang, Chang-Bin Zhang, Peng-Tao Jiang, Ming-Ming

张宇 15 Jul 09, 2022
X-VLM: Multi-Grained Vision Language Pre-Training

X-VLM: learning multi-grained vision language alignments Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts. Yan Zeng, Xi

Yan Zeng 286 Dec 23, 2022
Tackling the Class Imbalance Problem of Deep Learning Based Head and Neck Organ Segmentation

Info This is the code repository of the work Tackling the Class Imbalance Problem of Deep Learning Based Head and Neck Organ Segmentation from Elias T

2 Apr 20, 2022
LSTM-VAE Implementation and Relevant Evaluations

LSTM-VAE Implementation and Relevant Evaluations Before using any file in this repository, please create two directories under the root directory name

Lan Zhang 5 Oct 08, 2022
[NeurIPS 2021] "G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators"

G-PATE This is the official code base for our NeurIPS 2021 paper: "G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of T

AI Secure 14 Oct 12, 2022
Deep-learning X-Ray Micro-CT image enhancement, pore-network modelling and continuum modelling

EDSR modelling A Github repository for deep-learning image enhancement, pore-network and continuum modelling from X-Ray Micro-CT images. The repositor

Samuel Jackson 7 Nov 03, 2022
A simple tutoral for error correction task, based on Pytorch

gramcorrector A simple tutoral for error correction task, based on Pytorch Grammatical Error Detection (sentence-level) a binary sequence-based classi

peiyuan_gong 8 Dec 03, 2022
A simple, unofficial implementation of MAE using pytorch-lightning

Masked Autoencoders in PyTorch A simple, unofficial implementation of MAE (Masked Autoencoders are Scalable Vision Learners) using pytorch-lightning.

Connor Anderson 20 Dec 03, 2022
Pytorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"

GradTTS Unofficial Pytorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech" (arxiv) About this repo This is an unoffic

HeyangXue1997 103 Dec 23, 2022
Automatic voice-synthetised summaries of latest research papers on arXiv

PaperWhisperer PaperWhisperer is a Python application that keeps you up-to-date with research papers. How? It retrieves the latest articles from arXiv

Valerio Velardo 124 Dec 20, 2022
TakeInfoatNistforICS - Take Information in NIST NVD for ICS

Take Information in NIST NVD for ICS This project developed with Python. When yo

5 Sep 05, 2022
시각 장애인을 위한 스마트 지팡이에 활용될 딥러닝 모델 (DL Model Repo)

SmartCane-DL-Model Smart Cane using semantic segmentation 참고한 Github repositoy 🔗 https://github.com/JunHyeok96/Road-Segmentation.git 데이터셋 🔗 https://

반드시 졸업한다 (Team Just Graduate) 4 Dec 03, 2021