Learning Camera Localization via Dense Scene Matching, CVPR2021

Overview

This repository contains code of our CVPR 2021 paper - "Learning Camera Localization via Dense Scene Matching" by Shitao Tang, Chengzhou Tang, Rui Huang, Siyu Zhu and Ping Tan.

This paper presents a new method for scene agnostic camera localization using dense scene matching (DSM), where a cost volume is constructed between a query image and a scene. The cost volume and the corresponding coordinates are processed by a CNN to predict dense coordinates. Camera poses can then be solved by PnP algorithms.

If you find this project useful, please cite:

@inproceedings{Tang2021Learning,
  title={Learning Camera Localization via Dense Scene Matching},
  author={Shitao Tang, Chengzhou Tang, Rui Huang, Siyu Zhu and Ping Tan},
  booktitle={Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}

Usage

Environment

  • The codes are tested along with
    • pytorch=1.4.0
    • lmdb (optional)
    • yaml
    • skimage
    • opencv
    • numpy=1.17
    • tensorboard

Installation

  • Build PyTorch operations
      cd libs/model/ops
      python setup.py install
    
  • Build PnP algorithm
      cd libs/utils/lm_pnp
      mkdir build
      cd build
      cmake ..
      make all
    

Train and Test

  • Download

    You can download the trained models and label files for 7scenes, Cambridge, Scannet.

    For 7scenes, you can use the prepared data in the following.

    Chess Fire Heads Office Pumpkin Kitchen Stairs

    For Cambridge landmarks, you can download image files here, and depths here.

  • Test

    Please refer to configs/7scenes.yaml for detailed explaination of how to set label file path and image file path.

    • 7scenes
      python tools/video_test.py --config configs/7scenes.yaml
      
    • Camrbrige
      python tools/video_test.py --config configs/cambridge.yaml
      
  • Train

    We use ResNet-FPN pretrained model.

      python tools/train_net.py
    
Owner
tangshitao
tangshitao
The CIS OCR PostCorrectionTool

The CIS OCR Post Correction Tool PoCoTo Source code for the Java-based PoCoTo client enabling fast interactive batch corrections of complete OCR error

CIS OCR Group 36 Dec 15, 2022
Simple app for visual editing of Page XML files

Name nw-page-editor - Simple app for visual editing of Page XML files. Version: 2021.02.22 Description nw-page-editor is an application for viewing/ed

Mauricio Villegas 27 Jun 20, 2022
nofacedb/faceprocessor is a face recognition engine for NoFaceDB program complex.

faceprocessor nofacedb/faceprocessor is a face recognition engine for NoFaceDB program complex. Tech faceprocessor uses a number of open source projec

NoFaceDB 3 Sep 06, 2021
POT : Python Optimal Transport

This open source Python library provide several solvers for optimization problems related to Optimal Transport for signal, image processing and machine learning.

Python Optimal Transport 1.7k Jan 04, 2023
Code for the ACL2021 paper "Combining Static Word Embedding and Contextual Representations for Bilingual Lexicon Induction"

CSCBLI Code for our ACL Findings 2021 paper, "Combining Static Word Embedding and Contextual Representations for Bilingual Lexicon Induction". Require

Jinpeng Zhang 12 Oct 08, 2022
Layout Analysis Evaluator for the ICDAR 2017 competition on Layout Analysis for Challenging Medieval Manuscripts

LayoutAnalysisEvaluator Layout Analysis Evaluator for: ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records ICD

17 Dec 08, 2022
Course material for the Multi-agents and computer graphics course

TC2008B Course material for the Multi-agents and computer graphics course. Setup instructions Strongly recommend using a custom conda environment. Ins

16 Dec 13, 2022
An Agnostic Computer Vision Framework - Pluggable to any Training Library: Fastai, Pytorch-Lightning with more to come

An Agnostic Object Detection Framework IceVision is the first agnostic computer vision framework to offer a curated collection with hundreds of high-q

airctic 790 Jan 05, 2023
This is a GUI for scrapping PDFs with the help of optical character recognition making easier than ever to scrape PDFs.

pdf-scraper-with-ocr With this tool I am aiming to facilitate the work of those who need to scrape PDFs either by hand or using tools that doesn't imp

Jacobo José Guijarro Villalba 75 Oct 21, 2022
Convolutional Recurrent Neural Network (CRNN) for image-based sequence recognition.

Convolutional Recurrent Neural Network This software implements the Convolutional Recurrent Neural Network (CRNN), a combination of CNN, RNN and CTC l

Baoguang Shi 2k Dec 31, 2022
This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).

TransFG: A Transformer Architecture for Fine-grained Recognition Official PyTorch code for the paper: TransFG: A Transformer Architecture for Fine-gra

Ju He 307 Jan 03, 2023
Controlling Volume by Hand Gestures

This program allows the user to control the volume of their device with specific hand gestures involving their thumb and index finger!

Riddhi Bajaj 1 Nov 11, 2021
Um simples projeto para fazer o reconhecimento do captcha usado pelo jogo bombcrypto

CaptchaSolver - LEIA ISSO 😓 Para iniciar o codigo: pip install -r requirements.txt python captcha_solver.py Se você deseja pegar ver o resultado das

Kawanderson 50 Mar 21, 2022
Toolbox for OCR post-correction

Ochre Ochre is a toolbox for OCR post-correction. Please note that this software is experimental and very much a work in progress! Overview of OCR pos

National Library of the Netherlands / Research 117 Nov 10, 2022
list all open dataset about ocr.

ocr-open-dataset list all open dataset about ocr. printed dataset year Born-Digital Images (Web and Email) 2011-2015 COCO-Text 2017 Text Extraction fr

hongbomin 95 Nov 24, 2022
An organized collection of tutorials and projects created for aspriring computer vision students.

A repository created with the purpose of teaching students in BME lab 308A- Hanoi University of Science and Technology

Givralnguyen 5 Nov 24, 2021
DouZero is a reinforcement learning framework for DouDizhu - 斗地主AI

[ICML 2021] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning | 斗地主AI

Kwai 3.1k Jan 05, 2023
Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and text alignment capabilities.

Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and text alignment capabilities.

Microsoft 235 Dec 22, 2022
A PyTorch implementation of ECCV2018 Paper: TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes

TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes A PyTorch implement of TextSnake: A Flexible Representation for Detecting

Prince Wang 417 Dec 12, 2022
[BMVC'21] Official PyTorch Implementation of Grounded Situation Recognition with Transformers

Grounded Situation Recognition with Transformers Paper | Model Checkpoint This is the official PyTorch implementation of Grounded Situation Recognitio

Junhyeong Cho 18 Jul 19, 2022