CTRL-C: Camera calibration TRansformer with Line-Classification

Last update: Nov 14, 2022

This repository contains the official code and pretrained models for CTRL-C (Camera calibration TRansformer with Line-Classification), by Jinwoo Lee, Hyunsung Go, Hyunjoon Lee, Sunghyun Cho, Minhyuk Sung, and Junho Kim, published at ICCV 2021.

Single image camera calibration is the task of estimating camera parameters, such as the vanishing points, focal length, and horizon line, from a single input image. In this work, we propose Camera calibration TRansformer with Line-Classification (CTRL-C), an end-to-end neural network-based approach to single image camera calibration that directly estimates the camera parameters from an image and a set of line segments. Our network adopts the transformer architecture to capture the global structure of an image from multi-modal inputs. We also propose an auxiliary line classification task that trains the network to extract global geometric information from lines effectively. Our experiments demonstrate that CTRL-C outperforms the previous state-of-the-art methods on the Google Street View and SUN360 benchmark datasets.
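To make the relationship between the estimated quantities concrete, here is a minimal geometric sketch of how pitch, roll, and field of view determine the focal length, the horizon line, and the zenith vanishing point under a pinhole model. The function names and the sign conventions (camera axes x right, y down, z forward; positive pitch tilts the camera upward; principal point at the image center) are assumptions made for illustration and may differ from the conventions used in this repository.

```python
import math


def focal_from_vfov(vfov_deg, image_height):
    """Focal length in pixels from the vertical field of view (pinhole model)."""
    return 0.5 * image_height / math.tan(math.radians(vfov_deg) / 2.0)


def up_vector(pitch_deg, roll_deg):
    """World up direction in camera coordinates (x right, y down, z forward).

    Assumed convention: positive pitch tilts the camera upward.
    """
    p, r = math.radians(pitch_deg), math.radians(roll_deg)
    return (-math.sin(r) * math.cos(p), -math.cos(r) * math.cos(p), math.sin(p))


def horizon_line(pitch_deg, roll_deg, focal):
    """Coefficients (a, b, c) of the horizon line a*u + b*v + c = 0.

    (u, v) are pixel coordinates relative to the principal point. The horizon
    is the set of image points whose viewing ray is perpendicular to up.
    """
    ux, uy, uz = up_vector(pitch_deg, roll_deg)
    return (ux, uy, uz * focal)


def zenith_vanishing_point(pitch_deg, roll_deg, focal):
    """Image position of the zenith vanishing point, or None if at infinity."""
    ux, uy, uz = up_vector(pitch_deg, roll_deg)
    if abs(uz) < 1e-12:
        return None  # zero pitch: vertical lines stay parallel in the image
    return (focal * ux / uz, focal * uy / uz)
```

With zero pitch and roll the horizon passes through the principal point; tilting the camera upward moves the horizon down in the image and brings the zenith vanishing point in from infinity above the image center.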

Results & Checkpoints

Dataset	Up Dir (°)	Pitch (°)	Roll (°)	FoV (°)	AUC (%)	URL
Google Street View	1.80	1.58	0.66	3.59	87.29	gdrive
SUN360	1.91	1.50	0.96	3.80	85.45	gdrive
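The table does not define the AUC column. Assuming it follows the common horizon-line benchmark convention (area under the cumulative distribution of per-image horizon errors, truncated at a maximum error and normalized to 1), a minimal sketch of such a metric could look like the following. The function name, the default threshold of 0.25, and the linear interpolation between points are assumptions for illustration, not the evaluation code shipped with this repository.

```python
def horizon_error_auc(errors, max_error=0.25):
    """Normalized area under the cumulative error curve on [0, max_error].

    `errors` holds one non-negative horizon error per image (assumed to be
    already normalized, e.g. by image height). Returns a value in [0, 1].
    """
    if not errors:
        raise ValueError("errors must not be empty")
    sorted_errors = sorted(errors)
    n = len(sorted_errors)

    # Curve points: start at (0, 0), then (e_i, i / n) for each sorted error
    # that falls inside the threshold.
    xs, ys = [0.0], [0.0]
    for i, err in enumerate(sorted_errors, start=1):
        if err > max_error:
            break
        xs.append(err)
        ys.append(i / n)

    # Extend the curve horizontally to the threshold.
    xs.append(max_error)
    ys.append(ys[-1])

    # Trapezoidal integration, then normalization by the threshold.
    area = 0.0
    for k in range(1, len(xs)):
        area += (xs[k] - xs[k - 1]) * (ys[k] + ys[k - 1]) / 2.0
    return area / max_error
```

Multiplying the result by 100 gives a percentage comparable in form to the AUC column, provided the assumptions above match the benchmark protocol.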

Preparation

Clone this repository

Setup environments

conda create -n ctrlc python
conda activate ctrlc
conda install -c pytorch torchvision

pip install -r requirements.txt

Training Datasets

Google Street View dataset
SUN360 dataset
- You need to preprocess the dataset

Training

Single GPU

python main.py --config-file 'config-files/ctrl-c.yaml' --opts OUTPUT_DIR 'logs'
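The `--opts OUTPUT_DIR 'logs'` argument passes alternating key and value tokens that override entries loaded from the YAML config file. The sketch below illustrates that convention with a plain dictionary. It assumes a yacs-style override scheme with dotted keys for nested sections; `merge_opts` and the `SOLVER.LR` key in the usage note are hypothetical names, not part of this repository.

```python
import ast


def merge_opts(cfg, opts):
    """Apply alternating KEY VALUE overrides to a nested config dict in place.

    Dotted keys address nested sections. Unknown keys raise KeyError so that
    typos on the command line are not silently accepted.
    """
    if len(opts) % 2 != 0:
        raise ValueError("opts must contain KEY VALUE pairs")
    for key, raw in zip(opts[0::2], opts[1::2]):
        *parents, leaf = key.split(".")
        node = cfg
        for name in parents:
            node = node[name]
        if leaf not in node:
            raise KeyError(key)
        try:
            # Interpret numbers, booleans, lists, and so on.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything that is not a Python literal is kept as a string.
            value = raw
        node[leaf] = value
    return cfg
```

For example, `merge_opts({"OUTPUT_DIR": "", "SOLVER": {"LR": 0.1}}, ["OUTPUT_DIR", "logs", "SOLVER.LR", "0.001"])` sets the output directory to the string `logs` and the hypothetical learning rate to the float `0.001`.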

Multi GPU

python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --config-file 'config-files/ctrl-c.yaml' --opts OUTPUT_DIR 'logs'
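With `--use_env`, the launcher passes each worker its rank through environment variables (`RANK`, `WORLD_SIZE`, `LOCAL_RANK`) instead of a `--local_rank` command-line argument. The sketch below shows one way a training script might read them and fall back to single-process settings. The helper `read_dist_env` is a hypothetical name for illustration; the actual handling in `main.py` may differ.

```python
import os


def read_dist_env(environ=None):
    """Read distributed-launch settings from environment variables.

    Falls back to single-process defaults when RANK or WORLD_SIZE is absent,
    as is the case for a plain `python main.py` run.
    """
    env = os.environ if environ is None else environ
    if "RANK" not in env or "WORLD_SIZE" not in env:
        return {"distributed": False, "rank": 0, "world_size": 1, "local_rank": 0}
    return {
        "distributed": True,
        "rank": int(env["RANK"]),              # global index of this process
        "world_size": int(env["WORLD_SIZE"]),  # total number of processes
        "local_rank": int(env.get("LOCAL_RANK", 0)),  # GPU index on this node
    }
```

A script would typically use `local_rank` to select its GPU and `rank` together with `world_size` to initialize the process group.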

Evaluation

python test.py --dataset 'GoogleStreetView' --opts OUTPUT_DIR 'outputs'

Citation

If you use this code for your research, please cite our paper:

@InProceedings{Lee:2021:ICCV,
    Title     = {{CTRL-C: Camera calibration TRansformer with Line-Classification}},
    Author    = {Jinwoo Lee and Hyunsung Go and Hyunjoon Lee and Sunghyun Cho and Minhyuk Sung and Junho Kim},    
    Booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    Year      = {2021},
}

License

CTRL-C is released under the Apache 2.0 license. Please see the LICENSE file for more information.

Acknowledgments

This code is based on the implementation of DETR (End-to-End Object Detection with Transformers).
