Extreme Rotation Estimation using Dense Correlation Volumes

Overview

Extreme Rotation Estimation using Dense Correlation Volumes

This repository contains a PyTorch implementation of the paper:

Extreme Rotation Estimation using Dense Correlation Volumes [Project page] [Arxiv]

Ruojin Cai, Bharath Hariharan, Noah Snavely, Hadar Averbuch-Elor

CVPR 2021

Introduction

We present a technique for estimating the relative 3D rotation of an RGB image pair in an extreme setting, where the images have little or no overlap. We observe that, even when images do not overlap, there may be rich hidden cues as to their geometric relationship, such as light source directions, vanishing points, and symmetries present in the scene. We propose a network design that can automatically learn such implicit cues by comparing all pairs of points between the two input images. Our method therefore constructs dense feature correlation volumes and processes these to predict relative 3D rotations. Our predictions are formed over a fine-grained discretization of rotations, bypassing difficulties associated with regressing 3D rotations. We demonstrate our approach on a large variety of extreme RGB image pairs, including indoor and outdoor images captured under different lighting conditions and geographic locations. Our evaluation shows that our model can successfully estimate relative rotations among non-overlapping images without compromising performance over overlapping image pairs.

Overview of our Method:

Overview

Given a pair of images, a shared-weight Siamese encoder extracts feature maps. We compute a 4D correlation volume using the inner product of features, from which our model predicts the relative rotation (here, as distributions over Euler angles).

Dependencies

# Create conda environment with python 3.6, torch 1.3.1 and CUDA 10.0
conda env create -f ./tools/environment.yml
conda activate rota

Dataset

Perspective images are randomly sampled from panoramas with a resolution of 256 × 256 and a 90◦ FoV. We sample images distributed uniformly over the range of [−180, 180] for yaw angles. To avoid generating textureless images that focus on the ceiling/sky or the floor, we limit the range over pitch angles to [−30◦, 30◦] for the indoor datasets and [−45◦, 45◦] for the outdoor dataset.

Download InteriorNet, SUN360, and StreetLearn datasets to obtain the full panoramas.

Metadata files about the training and test image pairs are available in the following google drive: link. Download the metadata.zip file, unzip it and put it under the project root directory.

We base on this MATLAB Toolbox that extracts perspective images from an input panorama. Before running PanoBasic/pano2perspective_script.m, you need to modify the path to the datasets and metadata files in the script.

Pretrained Model

Pretrained models are be available in the following google drive: link. To use the pretrained models, download the pretrained.zip file, unzip it and put it under the project root directory.

Testing the pretrained model:

The following commands test the performance of the pre-trained models in the rotation estimation task. The commands output the mean and median geodesic error, and the percentage of pairs with a relative rotation error under 10◦ for different levels of overlap on the test set.

# Usage:
# python test.py <config> --pretrained <checkpoint_filename>

python test.py configs/sun360/sun360_cv_distribution.yaml \
    --pretrained pretrained/sun360_cv_distribution.pt

python test.py configs/interiornet/interiornet_cv_distribution.yaml \
    --pretrained pretrained/interiornet_cv_distribution.pt

python test.py configs/interiornetT/interiornetT_cv_distribution.yaml \
    --pretrained pretrained/interiornetT_cv_distribution.pt

python test.py configs/streetlearn/streetlearn_cv_distribution.yaml \
    --pretrained pretrained/streetlearn_cv_distribution.pt

python test.py configs/streetlearnT/streetlearnT_cv_distribution.yaml \
    --pretrained pretrained/streetlearnT_cv_distribution.pt

Rotation estimation evaluation of the pretrained models is as follows:

InteriorNet InteriorNet-T SUM360 StreetLearn StreetLearn-T
Avg(°) Med(°) 10° Avg(°) Med(°) 10° Avg(°) Med(°) 10° Avg(°) Med(°) 10° Avg(°) Med(°) 10°
Large 1.82 0.88 98.76% 8.86 1.86 93.13% 1.37 1.09 99.51% 1.38 1.12 100.00% 24.98 2.50 78.95%
Small 4.31 1.16 96.58% 30.43 2.63 74.07% 6.13 1.77 95.86% 3.25 1.41 98.34% 27.84 3.19 74.76%
None 37.69 3.15 61.97% 49.44 4.17 58.36% 34.92 4.43 61.39% 5.46 1.65 96.60% 32.43 3.64 72.69%
All 13.49 1.18 86.90% 29.68 2.58 75.10% 20.45 2.23 78.30% 4.10 1.46 97.70% 29.85 3.19 74.30%

Training

# Usage:
# python train.py <config>

python train.py configs/interiornet/interiornet_cv_distribution.yaml

python train.py configs/interiornetT/interiornetT_cv_distribution.yaml

python train.py configs/sun360/sun360_cv_distribution_overlap.yaml
python train.py configs/sun360/sun360_cv_distribution.yaml --resume --pretrained <checkpoint_filename>

python train.py configs/streetlearn/streetlearn_cv_distribution_overlap.yaml
python train.py configs/streetlearn/streetlearn_cv_distribution.yaml --resume --pretrained <checkpoint_filename>

python train.py configs/streetlearnT/streetlearnT_cv_distribution_overlap.yaml
python train.py configs/streetlearnT/streetlearnT_cv_distribution.yaml --resume --pretrained <checkpoint_filename>

For SUN360 and StreetLearn dataset, finetune from the pretrained model, which is training with only overlapping pairs, at epoch 10. More configs about baselines can be found in the folder configs/sun360.

Cite

Please cite our work if you find it useful:

@inproceedings{Cai2021Extreme,
 title={Extreme Rotation Estimation using Dense Correlation Volumes},
 author={Cai, Ruojin and Hariharan, Bharath and Snavely, Noah and Averbuch-Elor, Hadar},
 booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
 year={2021}
}

Acknowledgment

This work was supported in part by the National Science Foundation (IIS-2008313) and by the generosity of Eric and Wendy Schmidt by recommendation of the Schmidt Futures program and the Zuckerman STEM leadership program.

Owner
Ruojin Cai
Ph.D. student at Cornell University
Ruojin Cai
Pytorch implementation for "Implicit Feature Alignment: Learn to Convert Text Recognizer to Text Spotter".

Implicit Feature Alignment: Learn to Convert Text Recognizer to Text Spotter This is a pytorch-based implementation for paper Implicit Feature Alignme

wangtianwei 61 Nov 12, 2022
Multiple-criteria decision-making (MCDM) with Electre, Promethee, Weighted Sum and Pareto

EasyMCDM - Quick Installation methods Install with PyPI Once you have created your Python environment (Python 3.6+) you can simply type: pip3 install

Labrak Yanis 6 Nov 22, 2022
On Evaluation Metrics for Graph Generative Models

On Evaluation Metrics for Graph Generative Models Authors: Rylee Thompson, Boris Knyazev, Elahe Ghalebi, Jungtaek Kim, Graham Taylor This is the offic

13 Jan 07, 2023
Plugin for Gaffer providing direct acess to asset from PolyHaven.com. Only HDRIs at the moment, Cycles and Arnold supported

GafferHaven Plugin for Gaffer providing direct acess to asset from PolyHaven.com. Only HDRIs are supported at the moment, in Cycles and Arnold lights.

Jakub Vondra 6 Jan 26, 2022
[ICCV 2021] FaPN: Feature-aligned Pyramid Network for Dense Image Prediction

FaPN: Feature-aligned Pyramid Network for Dense Image Prediction [arXiv] [Project Page] @inproceedings{ huang2021fapn, title={{FaPN}: Feature-alig

Shihua Huang 23 Jul 22, 2022
A collection of random and hastily hacked together scripts for investigating EU-DCC

A collection of random and hastily hacked together scripts for investigating EU-DCC

Ryan Barrett 8 Mar 01, 2022
GANTheftAuto is a fork of the Nvidia's GameGAN

Description GANTheftAuto is a fork of the Nvidia's GameGAN, which is research focused on emulating dynamic game environments. The early research done

Harrison 801 Dec 27, 2022
yolov5目标检测模型的知识蒸馏(基于响应的蒸馏)

代码地址: https://github.com/Sharpiless/yolov5-knowledge-distillation 教师模型: python train.py --weights weights/yolov5m.pt \ --cfg models/yolov5m.ya

52 Dec 04, 2022
HW3 ― GAN, ACGAN and UDA

HW3 ― GAN, ACGAN and UDA In this assignment, you are given datasets of human face and digit images. You will need to implement the models of both GAN

grassking100 1 Dec 13, 2021
Categorical Depth Distribution Network for Monocular 3D Object Detection

CaDDN CaDDN is a monocular-based 3D object detection method. This repository is based off of [OpenPCDet]. Categorical Depth Distribution Network for M

Toronto Robotics and AI Laboratory 289 Jan 05, 2023
Paaster is a secure by default end-to-end encrypted pastebin built with the objective of simplicity.

Follow the development of our desktop client here Paaster Paaster is a secure by default end-to-end encrypted pastebin built with the objective of sim

Ward 211 Dec 25, 2022
Creating a Linear Program Solver by Implementing the Simplex Method in Python with NumPy

Creating a Linear Program Solver by Implementing the Simplex Method in Python with NumPy Simplex Algorithm is a popular algorithm for linear programmi

Reda BELHAJ 2 Oct 12, 2022
[ICCV 2021] Deep Hough Voting for Robust Global Registration

Deep Hough Voting for Robust Global Registration, ICCV, 2021 Project Page | Paper | Video Deep Hough Voting for Robust Global Registration Junha Lee1,

Junha Lee 10 Dec 02, 2022
The 2nd place solution of 2021 google landmark retrieval on kaggle.

Google_Landmark_Retrieval_2021_2nd_Place_Solution The 2nd place solution of 2021 google landmark retrieval on kaggle. Environment We use cuda 11.1/pyt

229 Dec 13, 2022
This is an (re-)implementation of DeepLab-ResNet in TensorFlow for semantic image segmentation on the PASCAL VOC dataset.

DeepLab-ResNet-TensorFlow This is an (re-)implementation of DeepLab-ResNet in TensorFlow for semantic image segmentation on the PASCAL VOC dataset. Up

19 Jan 16, 2022
Implementation of the paper titled "Using Sampling to Estimate and Improve Performance of Automated Scoring Systems with Guarantees"

Using Sampling to Estimate and Improve Performance of Automated Scoring Systems with Guarantees Implementation of the paper titled "Using Sampling to

MIDAS, IIIT Delhi 2 Aug 29, 2022
"Learning Free Gait Transition for Quadruped Robots vis Phase-Guided Controller"

PhaseGuidedControl The current version is developed based on the old version of RaiSim series, and possibly requires further modification. It will be

X-Mechanics 12 Oct 21, 2022
The implementation of CVPR2021 paper Temporal Query Networks for Fine-grained Video Understanding, by Chuhan Zhang, Ankush Gupta and Andrew Zisserman.

Temporal Query Networks for Fine-grained Video Understanding 📋 This repository contains the implementation of CVPR2021 paper Temporal_Query_Networks

55 Dec 21, 2022
Learning 3D Part Assembly from a Single Image

Learning 3D Part Assembly from a Single Image This repository contains a PyTorch implementation of the paper: Learning 3D Part Assembly from A Single

18 Dec 21, 2022
Code for reproducing key results in the paper "InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets"

Status: Archive (code is provided as-is, no updates expected) InfoGAN Code for reproducing key results in the paper InfoGAN: Interpretable Representat

OpenAI 1k Dec 19, 2022