Official Pytorch implementation of RePOSE (ICCV2021)

Related tags

Deep LearningRePOSE
Overview

RePOSE: Iterative Rendering and Refinement for 6D Object Detection (ICCV2021) [Link]

overview

Abstract

We present RePOSE, a fast iterative refinement method for 6D object pose estimation. Prior methods perform refinement by feeding zoomed-in input and rendered RGB images into a CNN and directly regressing an update of a refined pose. Their runtime is slow due to the computational cost of CNN, which is especially prominent in multiple-object pose refinement. To overcome this problem, RePOSE leverages image rendering for fast feature extraction using a 3D model with a learnable texture. We call this deep texture rendering, which uses a shallow multi-layer perceptron to directly regress a view-invariant image representation of an object. Furthermore, we utilize differentiable Levenberg-Marquardt (LM) optimization to refine a pose fast and accurately by minimizing the feature-metric error between the input and rendered image representations without the need of zooming in. These image representations are trained such that differentiable LM optimization converges within few iterations. Consequently, RePOSE runs at 92 FPS and achieves state-of-the-art accuracy of 51.6% on the Occlusion LineMOD dataset - a 4.1% absolute improvement over the prior art, and comparable result on the YCB-Video dataset with a much faster runtime.

Prerequisites

  • Python >= 3.6
  • Pytorch == 1.9.0
  • Torchvision == 0.10.0
  • CUDA == 10.1

Downloads

Installation

  1. Set up the python environment:
    $ pip install torch==1.9.0 torchvision==0.10.0
    $ pip install Cython==0.29.17
    $ sudo apt-get install libglfw3-dev libglfw3
    $ pip install -r requirements.txt
    
    # Install Differentiable Renderer
    $ cd renderer
    $ python3 setup.py install
    
  2. Compile cuda extensions under lib/csrc:
    ROOT=/path/to/RePOSE
    cd $ROOT/lib/csrc
    export CUDA_HOME="/usr/local/cuda-10.1"
    cd ../ransac_voting
    python setup.py build_ext --inplace
    cd ../camera_jacobian
    python setup.py build_ext --inplace
    cd ../nn
    python setup.py build_ext --inplace
    cd ../fps
    python setup.py
    
  3. Set up datasets:
    $ ROOT=/path/to/RePOSE
    $ cd $ROOT/data
    
    $ ln -s /path/to/linemod linemod
    $ ln -s /path/to/linemod_orig linemod_orig
    $ ln -s /path/to/occlusion_linemod occlusion_linemod
    
    $ cd $ROOT/data/model/
    $ unzip pretrained_models.zip
    
    $ cd $ROOT/cache/LinemodTest
    $ unzip ape.zip benchvise.zip .... phone.zip
    $ cd $ROOT/cache/LinemodOccTest
    $ unzip ape.zip can.zip .... holepuncher.zip
    

Testing

We have 13 categories (ape, benchvise, cam, can, cat, driller, duck, eggbox, glue, holepuncher, iron, lamp, phone) on the LineMOD dataset and 8 categories (ape, can, cat, driller, duck, eggbox, glue, holepuncher) on the Occlusion LineMOD dataset. Please choose the one category you like (replace ape with another category) and perform testing.

Evaluate the ADD(-S) score

  1. Generate the annotation data:
    python run.py --type linemod cls_type ape model ape
    
  2. Test:
    # Test on the LineMOD dataset
    $ python run.py --type evaluate --cfg_file configs/linemod.yaml cls_type ape model ape
    
    # Test on the Occlusion LineMOD dataset
    $ python run.py --type evaluate --cfg_file configs/linemod.yaml test.dataset LinemodOccTest cls_type ape model ape
    

Visualization

  1. Generate the annotation data:
    python run.py --type linemod cls_type ape model ape
    
  2. Visualize:
    # Visualize the results of the LineMOD dataset
    python run.py --type visualize --cfg_file configs/linemod.yaml cls_type ape model ape
    
    # Visualize the results of the Occlusion LineMOD dataset
    python run.py --type visualize --cfg_file configs/linemod.yaml test.dataset LinemodOccTest cls_type ape model ape
    

Citation

@InProceedings{Iwase_2021_ICCV,
    author    = {Iwase, Shun and Liu, Xingyu and Khirodkar, Rawal and Yokota, Rio and Kitani, Kris M.},
    title     = {RePOSE: Fast 6D Object Pose Refinement via Deep Texture Rendering},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {3303-3312}
}

Acknowledgement

Our code is largely based on clean-pvnet and our rendering code is based on neural_renderer. Thank you so much for making these codes publicly available!

Contact

If you have any questions about the paper and implementation, please feel free to email me ([email protected])! Thank you!

Owner
Shun Iwase
Carnegie Mellon University, Robotics Institute
Shun Iwase
TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL, and utterance id

TEDSummary is a speech summary corpus. It includes TED talks subtitle (Document), Title-Detail (Summary), speaker name (Meta info), MP4 URL

3 Dec 26, 2022
Spatial Attentive Single-Image Deraining with a High Quality Real Rain Dataset (CVPR'19)

Spatial Attentive Single-Image Deraining with a High Quality Real Rain Dataset (CVPR'19) Tianyu Wang*, Xin Yang*, Ke Xu, Shaozhe Chen, Qiang Zhang, Ry

Steve Wong 177 Dec 01, 2022
MapReader: A computer vision pipeline for the semantic exploration of maps at scale

MapReader A computer vision pipeline for the semantic exploration of maps at scale MapReader is an end-to-end computer vision (CV) pipeline designed b

Living with Machines 25 Dec 26, 2022
Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data

Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data This is the official PyTorch implementation of the SeCo paper: @articl

ElementAI 101 Dec 12, 2022
Denoising Diffusion Probabilistic Models

Denoising Diffusion Probabilistic Models Jonathan Ho, Ajay Jain, Pieter Abbeel Paper: https://arxiv.org/abs/2006.11239 Website: https://hojonathanho.g

Jonathan Ho 1.5k Jan 08, 2023
Point detection through multi-instance deep heatmap regression for sutures in endoscopy

Suture detection PyTorch This repo contains the reference implementation of suture detection model in PyTorch for the paper Point detection through mu

artificial intelligence in the area of cardiovascular healthcare 3 Jul 16, 2022
Notspot robot simulation - Python version

Notspot robot simulation - Python version This repository contains all the files and code needed to simulate the notspot quadrupedal robot using Gazeb

50 Sep 26, 2022
Light-SERNet: A lightweight fully convolutional neural network for speech emotion recognition

Light-SERNet This is the Tensorflow 2.x implementation of our paper "Light-SERNet: A lightweight fully convolutional neural network for speech emotion

Arya Aftab 29 Nov 12, 2022
Photo2cartoon - 人像卡通化探索项目 (photo-to-cartoon translation project)

人像卡通化 (Photo to Cartoon) 中文版 | English Version 该项目为小视科技卡通肖像探索项目。您可使用微信扫描下方二维码或搜索“AI卡通秀”小程序体验卡通化效果。

Minivision_AI 3.5k Dec 30, 2022
This repo includes the CUB-GHA (Gaze-based Human Attention) dataset and code of the paper "Human Attention in Fine-grained Classification".

HA-in-Fine-Grained-Classification This repo includes the CUB-GHA (Gaze-based Human Attention) dataset and code of the paper "Human Attention in Fine-g

16 Oct 29, 2022
GANimation: Anatomically-aware Facial Animation from a Single Image (ECCV'18 Oral) [PyTorch]

GANimation: Anatomically-aware Facial Animation from a Single Image [Project] [Paper] Official implementation of GANimation. In this work we introduce

Albert Pumarola 1.8k Dec 28, 2022
Rethinking the Importance of Implementation Tricks in Multi-Agent Reinforcement Learning

RIIT Our open-source code for RIIT: Rethinking the Importance of Implementation Tricks in Multi-AgentReinforcement Learning. We implement and standard

405 Jan 06, 2023
Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation

Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation Requirements This repository needs mmsegmentation Training To train

20 May 28, 2022
Autonomous Movement from Simultaneous Localization and Mapping

Autonomous Movement from Simultaneous Localization and Mapping About us Built by a group of Clarkson University students with the help from Professor

14 Nov 07, 2022
ICLR 2021: Pre-Training for Context Representation in Conversational Semantic Parsing

SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing This repository contains code for the ICLR 2021 paper "SCoRE: Pre-Tr

Microsoft 28 Oct 02, 2022
On the model-based stochastic value gradient for continuous reinforcement learning

On the model-based stochastic value gradient for continuous reinforcement learning This repository is by Brandon Amos, Samuel Stanton, Denis Yarats, a

Facebook Research 46 Dec 15, 2022
Audio-Visual Generalized Few-Shot Learning with Prototype-Based Co-Adaptation

Audio-Visual Generalized Few-Shot Learning with Prototype-Based Co-Adaptation The code repository for "Audio-Visual Generalized Few-Shot Learning with

Kaiaicy 3 Jun 27, 2022
Data-depth-inference - Data depth inference with python

Welcome! This readme will guide you through the use of the code in this reposito

Marco 3 Feb 08, 2022
This is the repo for the paper `SumGNN: Multi-typed Drug Interaction Prediction via Efficient Knowledge Graph Summarization'. (published in Bioinformatics'21)

SumGNN: Multi-typed Drug Interaction Prediction via Efficient Knowledge Graph Summarization This is the code for our paper ``SumGNN: Multi-typed Drug

Yue Yu 58 Dec 21, 2022
Attentive Implicit Representation Networks (AIR-Nets)

Attentive Implicit Representation Networks (AIR-Nets) Preprint | Supplementary | Accepted at the International Conference on 3D Vision (3DV) teaser.mo

29 Dec 07, 2022