Learning Camera Localization via Dense Scene Matching, CVPR2021

Last update: Dec 01, 2022

Related tags

Overview

This repository contains code of our CVPR 2021 paper - "Learning Camera Localization via Dense Scene Matching" by Shitao Tang, Chengzhou Tang, Rui Huang, Siyu Zhu and Ping Tan.

This paper presents a new method for scene agnostic camera localization using dense scene matching (DSM), where a cost volume is constructed between a query image and a scene. The cost volume and the corresponding coordinates are processed by a CNN to predict dense coordinates. Camera poses can then be solved by PnP algorithms.

If you find this project useful, please cite:

@inproceedings{Tang2021Learning,
  title={Learning Camera Localization via Dense Scene Matching},
  author={Shitao Tang, Chengzhou Tang, Rui Huang, Siyu Zhu and Ping Tan},
  booktitle={Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}

Usage

Environment

The codes are tested along with
- pytorch=1.4.0
- lmdb (optional)
- yaml
- skimage
- opencv
- numpy=1.17
- tensorboard

Installation

Build PyTorch operations

  cd libs/model/ops
  python setup.py install

Build PnP algorithm

  cd libs/utils/lm_pnp
  mkdir build
  cd build
  cmake ..
  make all

Train and Test

Download

You can download the trained models and label files for 7scenes, Cambridge, Scannet.

For 7scenes, you can use the prepared data in the following.

Chess Fire Heads Office Pumpkin Kitchen Stairs

For Cambridge landmarks, you can download image files here, and depths here.
Test

Please refer to configs/7scenes.yaml for detailed explaination of how to set label file path and image file path.
- 7scenes
```
python tools/video_test.py --config configs/7scenes.yaml
```
- Camrbrige
```
python tools/video_test.py --config configs/cambridge.yaml
```
Train

We use ResNet-FPN pretrained model.
```
  python tools/train_net.py
```

Learning Camera Localization via Dense Scene Matching, CVPR2021

Related tags

Overview

Usage

Environment

Installation

Train and Test

Owner

tangshitao

This is a project to detect gestures to zoom in or out, using the real-time distance between the index finger and the thumb. It's based on OpenCV and Mediapipe.

Train custom VR face tracking parameters

OCR-D-compliant page segmentation

TextBoxes re-implement using tensorflow

Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation, CVPR 2020 (Oral)

PianoVisuals - Create background videos synced with piano music using opencv

The code for “Oriented RepPoints for Aerail Object Detection”

Handwritten Character Recognition using CNN

The code of "Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes"

Python-based tools for document analysis and OCR

pyntcloud is a Python library for working with 3D point clouds.

Python library to extract tabular data from images and scanned PDFs

This is used to convert a string to an Image with Handwritten Characters.

OCR, Object Detection, Number Plate, Real Time

PyNeuro is designed to connect NeuroSky's MindWave EEG device to Python and provide Callback functionality to provide data to your application in real time.

Creating of virtual elements of the graphical interface using opencv and mediapipe.

Code for CVPR'2022 paper ✨ "Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model"

PyQT5 app that colorize black & white pictures using CNN(use pre-trained model which was made with OpenCV)

Satoshi is a discord bot template in python using discord.py that allow you to track some live crypto prices with your own discord bot.

learn how to use Gesture Control to change the volume of a computer