SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement

Overview

SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement

This repository implements the approach described in SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement (WACV 2022).

Iterative refinement using SporeAgent

Iterative registration using SporeAgent:
The initial pose from PoseCNN (purple) and the final pose using SporeAgent (blue) on the LINEMOD (left,cropped) and YCB-Video (right) datasets.

Scene-level Plausibility

Scene-level Plausibility:
The initial scene configuration from PoseCNN (left) results in an implausible pose of the target object (gray). Refinement using SporeAgent (right) results in a plausible scene configuration where the intersecting points (red) are resolved and the object rests on its supported points (cyan).

LINEMOD AD < 0.10d AD < 0.05d AD <0.02d YCB-Video ADD AUC AD AUC ADI AUC
PoseCNN 62.7 26.9 3.3 51.5 61.3 75.2
Point-to-Plane ICP 92.6 79.8 29.9 68.2 79.2 87.8
w/ VeREFINE 96.1 85.8 32.5 70.1 81.0 88.8
Multi-hypothesis ICP 99.3 89.9 35.6 77.4 86.6 92.6
SporeAgent 99.3 93.7 50.3 79.0 88.8 93.6

Comparison on LINEMOD and YCB-Video:
The initial pose and segmentation estimates are computed using PoseCNN. We compare our approach to vanilla Point-to-Plane ICP (from Open3D), Point-to-Plane ICP augmented by the simulation-based VeREFINE approach and the ICP-based multi-hypothesis approach used for refinement in PoseCNN.

Dependencies

The code has been tested on Ubuntu 16.04 and 20.04 with Python 3.6 and CUDA 10.2. To set-up the Python environment, use Anaconda and the provided YAML file:

conda env create -f environment.yml --name sporeagent

conda activate sporeagent.

The BOP Toolkit is additionally required. The BOP_PATH in config.py needs to be changed to the respective clone directory and the packages required by the BOP Toolkit need to be installed.

The YCB-Video Toolbox is required for experiments on the YCB-Video dataset.

Datasets

We use the dataset versions prepared for the BOP challenge. The required files can be downloaded to a directory of your choice using the following bash script:

export SRC=http://ptak.felk.cvut.cz/6DB/public/bop_datasets
export DATASET=ycbv                     # either "lm" or "ycbv"
wget $SRC/$DATASET_base.zip             # Base archive with dataset info, camera parameters, etc.
wget $SRC/$DATASET_models.zip           # 3D object models.
wget $SRC/$DATASET_test_all.zip         # All test images.
unzip $DATASET_base.zip                 # Contains folder DATASET.
unzip $DATASET_models.zip -d $DATASET   # Unpacks to DATASET.
unzip $DATASET_test_all.zip -d $DATASET # Unpacks to DATASET.

For training on YCB-Video, the $DATASET_train_real.zip is moreover required.

In addition, we have prepared point clouds sampled within the ground-truth masks (for training) and the segmentation masks computed using PoseCNN (for evaluation) for the LINEMOD and YCB-Video dataset. The samples for evaluation also include the initial pose estimates from PoseCNN.

LINEMOD

Extract the prepared samples into PATH_TO_BOP_LM/sporeagent/ and set LM_PATH in config.py to the base directory, i.e., PATH_TO_BOP_LM. Download the PoseCNN results and the corresponding image set definitions provided with DeepIM and extract both into POSECNN_LM_RESULTS_PATH. Finally, since the BOP challenge uses a different train/test split than the compared methods, the appropriate target file found here needs to be placed in the PATH_TO_BOP_LM directory.

To compute the AD scores using the BOP Toolkit, BOP_PATH/scripts/eval_bop19.py needs to be adapted:

  • to use ADI for symmetric objects and ADD otherwise with a 2/5/10% threshold, change p['errors'] to
{
  'n_top': -1,
  'type': 'ad',
  'correct_th': [[0.02], [0.05], [0.1]]
}
  • to use the correct test targets, change p['targets_filename'] to 'test_targets_add.json'

YCB-Video

Extract the prepared samples into PATH_TO_BOP_YCBV/reagent/ and set YCBV_PATH in config.py to the base directory, i.e., PATH_TO_BOP_YCBV. Clone the YCB Video Toolbox to POSECNN_YCBV_RESULTS_PATH. Extract the results_PoseCNN_RSS2018.zip and copy test_data_list.txt to the same directory. The POSECNN_YCBV_RESULTS_PATH in config.py needs to be changed to the respective directory. Additionally, place the meshes in the canonical frame models_eval_canonical in the PATH_TO_BOP_YCBV directory.

To compute the ADD/AD/ADI AUC scores using the YCB-Video Toolbox, replace the respective files in the toolbox by the ones provided in sporeagent/ycbv_toolbox.

Pretrained models

Weights for both datasets can be found here. Download and copy them to sporeagent/weights/.

Training

For LINEMOD: python registration/train.py --dataset=lm

For YCB-Video: python registration/train.py --dataset=ycbv

Evaluation

Note that we precompute the normal images used for pose scoring on the first run and store them to disk.

LINEMOD

The results for LINEMOD are computed using the BOP Toolkit. The evaluation script exports the required file by running

python registration/eval.py --dataset=lm,

which can then be processed via

python BOP_PATH/scripts/eval_bop19.py --result_filenames=PATH_TO_CSV_WITH_RESULTS.

YCB-Video

The results for YCB-Video are computed using the YCB-Video Toolbox. The evaluation script exports the results in BOP format by running

python registration/eval.py --dataset=ycbv,

which can then be parsed into the format used by the YCB-Video Toolbox by running

python utility/parse_matlab.py.

In MATLAB, run evaluate_poses_keyframe.m to generate the per-sample results and plot_accuracy_keyframe.m to compute the statistics.

Citation

If you use this repository in your publications, please cite

@article{bauer2022sporeagent,
    title={SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement},
    author={Bauer, Dominik and Patten, Timothy and Vincze, Markus},
    booktitle={IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    year={2022},
    pages={654-662}
}
Owner
Dominik Bauer
Dominik Bauer
Implementations for the ICLR-2021 paper: SEED: Self-supervised Distillation For Visual Representation.

Implementations for the ICLR-2021 paper: SEED: Self-supervised Distillation For Visual Representation.

Jacob 27 Oct 23, 2022
Analysis of Smiles through reservoir sampling & RDkit

Analysis of Smiles through reservoir sampling and machine learning (under development). This is a simple project that includes two Jupyter files for t

Aurimas A. Nausėdas 6 Aug 30, 2022
Official PaddlePaddle implementation of Paint Transformer

Paint Transformer: Feed Forward Neural Painting with Stroke Prediction [Paper] [Paddle Implementation] Update We have optimized the serial inference p

TianweiLin 284 Dec 31, 2022
LoveDA: A Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation (NeurIPS2021 Benchmark and Dataset Track)

LoveDA: A Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation by Junjue Wang, Zhuo Zheng, Ailong Ma, Xiaoyan Lu, and Yanfei Zh

Kingdrone 174 Dec 22, 2022
Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation (CVPR 2021)

Anti-Adversarially Manipulated Attributions for Weakly and Semi-Supervised Semantic Segmentation Input Image Initial CAM Successive Maps with adversar

Jungbeom Lee 110 Dec 07, 2022
This is the repository for The Machine Learning Workshops, published by AI DOJO

This is the repository for The Machine Learning Workshops, published by AI DOJO. It contains all the workshop's code with supporting project files necessary to work through the code.

AI Dojo 12 May 06, 2022
Retinal vessel segmentation based on GT-UNet

Retinal vessel segmentation based on GT-UNet Introduction This project is a retinal blood vessel segmentation code based on UNet-like Group Transforme

Kent0n 27 Dec 18, 2022
Instance-level Image Retrieval using Reranking Transformers

Instance-level Image Retrieval using Reranking Transformers Fuwen Tan, Jiangbo Yuan, Vicente Ordonez, ICCV 2021. Abstract Instance-level image retriev

UVA Computer Vision 87 Jan 03, 2023
This project uses Template Matching technique for object detecting by detection of template image over base image.

Object Detection Project Using OpenCV This project uses Template Matching technique for object detecting by detection the template image over base ima

Pratham Bhatnagar 7 May 29, 2022
Supplementary code for the paper "Meta-Solver for Neural Ordinary Differential Equations" https://arxiv.org/abs/2103.08561

Meta-Solver for Neural Ordinary Differential Equations Towards robust neural ODEs using parametrized solvers. Main idea Each Runge-Kutta (RK) solver w

Julia Gusak 25 Aug 12, 2021
Temporally Coherent GAN SIGGRAPH project.

TecoGAN This repository contains source code and materials for the TecoGAN project, i.e. code for a TEmporally COherent GAN for video super-resolution

Duc Linh Nguyen 2 Jan 18, 2022
A tool for calculating distortion parameters in coordination complexes.

OctaDist Octahedral distortion calculator: A tool for calculating distortion parameters in coordination complexes. https://octadist.github.io/ Registe

OctaDist 12 Oct 04, 2022
On Nonlinear Latent Transformations for GAN-based Image Editing - PyTorch implementation

On Nonlinear Latent Transformations for GAN-based Image Editing - PyTorch implementation On Nonlinear Latent Transformations for GAN-based Image Editi

Valentin Khrulkov 22 Oct 24, 2022
Build tensorflow keras model pipelines in a single line of code. Created by Ram Seshadri. Collaborators welcome. Permission granted upon request.

deep_autoviml Build keras pipelines and models in a single line of code! Table of Contents Motivation How it works Technology Install Usage API Image

AutoViz and Auto_ViML 102 Dec 17, 2022
Facestar dataset. High quality audio-visual recordings of human conversational speech.

Facestar Dataset Description Existing audio-visual datasets for human speech are either captured in a clean, controlled environment but contain only a

Meta Research 87 Dec 21, 2022
Surrogate- and Invariance-Boosted Contrastive Learning (SIB-CL)

Surrogate- and Invariance-Boosted Contrastive Learning (SIB-CL) This repository contains all source code used to generate the results in the article "

Charlotte Loh 3 Jul 23, 2022
Let's Git - Versionsverwaltung & Open Source Hausaufgabe

Let's Git - Versionsverwaltung & Open Source Hausaufgabe Herzlich Willkommen zu dieser Hausaufgabe für unseren MOOC: Let's Git! Wir hoffen, dass Du vi

1 Dec 13, 2021
Employee-Managment - Company employee registration software in the face recognition system

Employee-Managment Company employee registration software in the face recognitio

Alireza Kiaeipour 7 Jul 10, 2022
Bi-level feature alignment for versatile image translation and manipulation (Under submission of TPAMI)

Bi-level feature alignment for versatile image translation and manipulation (Under submission of TPAMI) Preparation Clone the Synchronized-BatchNorm-P

Fangneng Zhan 12 Aug 10, 2022
A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition"

A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition"

張致強 14 Dec 02, 2022