Joint deep network for feature line detection and description

Overview

SOLD² - Self-supervised Occlusion-aware Line Description and Detection

This repository contains the implementation of the paper: SOLD²: Self-supervised Occlusion-aware Line Description and Detection, J-T. Lin*, R. Pautrat*, V. Larsson, M. Oswald and M. Pollefeys (Oral at CVPR 2021).

SOLD² is a deep line segment detector and descriptor that can be trained without hand-labelled line segments and that can robustly match lines even in the presence of occlusion.

Demos

Matching in the presence of occlusion: demo_occlusion

Matching with a moving camera: demo_moving_camera

Usage

Installation

We recommend running this code inside a dedicated Python environment (e.g. venv or conda). The following command installs the required packages with pip:

pip install -r requirements.txt

Set your dataset and experiment paths (i.e. where your datasets and experiment checkpoints will be stored) by modifying the file config/project_config.py. Both variables DATASET_ROOT and EXP_PATH have to be set.
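For illustration, a minimal config/project_config.py could look like the sketch below; only the two variable names come from the repository, and the actual file may be structured differently:

```python
# config/project_config.py (illustrative sketch; the real file may differ).
# Both paths must point to existing, writable directories.
import os

DATASET_ROOT = os.path.expanduser("~/datasets")        # where the datasets are stored
EXP_PATH = os.path.expanduser("~/sold2_experiments")   # where checkpoints and logs are written
```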

You can download the version of the Wireframe dataset that we used during our training and testing here. This repository also includes some files to train on the Holicity dataset to add more outdoor images, but note that we did not test this dataset extensively, and the original paper was based on the Wireframe dataset only.

Training your own model

All training parameters are located in the configuration files in the folder config. Training SOLD² from scratch requires several steps, some of which can take several days, depending on the size of your dataset.

Step 1: Train on a synthetic dataset

The following command will create the synthetic dataset and start training the model on it:

python experiment.py --mode train --dataset_config config/synthetic_dataset.yaml --model_config config/train_detector.yaml --exp_name sold2_synth
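The self-supervised premise here is that simple geometric primitives can be rendered with their line segments known by construction, so ground truth comes for free. A toy sketch of that idea (the repository's actual generator is considerably more elaborate):

```python
import cv2
import numpy as np

def random_line_image(h=512, w=512, num_lines=8, rng=None):
    """Draw random segments on a flat canvas; the endpoints are free ground truth."""
    rng = rng or np.random.default_rng()
    img = np.full((h, w), 128, dtype=np.uint8)
    # Each row is (x1, y1, x2, y2) with x in [0, w) and y in [0, h).
    segments = rng.integers(0, [w, h, w, h], size=(num_lines, 4))
    for x1, y1, x2, y2 in segments:
        cv2.line(img, (int(x1), int(y1)), (int(x2), int(y2)),
                 color=int(rng.integers(0, 256)), thickness=2)
    return img, segments.reshape(num_lines, 2, 2)
```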
Step 2: Export the raw pseudo ground truth on the Wireframe dataset with homography adaptation

Note that this step can take from one to several days, depending on your machine and the size of the dataset. Set the export batch size to the largest value your GPU can handle.

python experiment.py --exp_name wireframe_train --mode export --resume_path <path to your previously trained sold2_synth> --model_config config/train_detector.yaml --dataset_config config/wireframe_dataset.yaml --checkpoint_name <name of the best checkpoint> --export_dataset_mode train --export_batch_size 4

You can then do the same for the test set:

python experiment.py --exp_name wireframe_test --mode export --resume_path <path to your previously trained sold2_synth> --model_config config/train_detector.yaml --dataset_config config/wireframe_dataset.yaml --checkpoint_name <name of the best checkpoint> --export_dataset_mode test --export_batch_size 4
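For intuition, homography adaptation aggregates the detector's predictions over many random warps of each image, which makes the pseudo ground truth more repeatable than a single forward pass. A minimal sketch of the idea (not the repository's exact implementation; the interface of detect_fn is an assumption):

```python
import cv2
import numpy as np

def homography_adaptation(img, detect_fn, num_homographies=20):
    """Average a detector's heatmap over random homographies of the image.

    detect_fn: any callable mapping an HxW image to an HxW heatmap.
    """
    h, w = img.shape[:2]
    acc = np.zeros((h, w), dtype=np.float32)
    weight = np.zeros((h, w), dtype=np.float32)
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    for _ in range(num_homographies):
        # Random perspective jitter of the image corners (up to 10%).
        dst = (src + np.random.uniform(-0.1, 0.1, (4, 2)) * [w, h]).astype(np.float32)
        H = cv2.getPerspectiveTransform(src, dst)
        heat = detect_fn(cv2.warpPerspective(img, H, (w, h))).astype(np.float32)
        # Warp the prediction back and only count pixels the warp actually covered.
        acc += cv2.warpPerspective(heat, np.linalg.inv(H), (w, h))
        weight += cv2.warpPerspective(np.ones((h, w), np.float32), np.linalg.inv(H), (w, h))
    return acc / np.maximum(weight, 1e-6)
```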
Step 3: Compute the ground truth line segments from the raw data
cd postprocess
python convert_homography_results.py <name of the previously exported file (e.g. "wireframe_train.h5")> <name of the new data with extracted line segments (e.g. "wireframe_train_gt.h5")> ../config/export_line_features.yaml
cd ..

We recommend testing the results on a few samples of your dataset to check the quality of the output, and adjusting the hyperparameters if necessary. For example, detect_thresh=0.5 and inlier_thresh=0.99 worked well for the Wireframe dataset in our case.
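As an example of such a spot check, one could overlay a few exported segments on their source image. Note that the h5 layout below (sample naming and the "line_segments" key) is an assumption; adapt it to what convert_homography_results.py actually writes:

```python
import cv2
import h5py
import numpy as np

# Hypothetical layout: one group per sample, each holding a "line_segments" array.
with h5py.File("wireframe_train_gt.h5", "r") as f:
    sample = sorted(f.keys())[0]                                   # first sample
    segments = np.array(f[sample]["line_segments"]).reshape(-1, 2, 2)

img = cv2.imread("<path to the corresponding image>")
for (x1, y1), (x2, y2) in segments.astype(int):
    cv2.line(img, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
cv2.imwrite("gt_check.png", img)                                   # inspect visually
```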

Step 4: Train the detector on the Wireframe dataset

We found it easier to pretrain the detector alone first, before fine-tuning it together with the descriptor part. Uncomment the lines 'gt_source_train' and 'gt_source_test' in config/wireframe_dataset.yaml and fill them with the paths to the h5 files generated in the previous step.

python experiment.py --mode train --dataset_config config/wireframe_dataset.yaml --model_config config/train_detector.yaml --exp_name sold2_wireframe

Alternatively, you can fine-tune the model previously trained on the synthetic dataset:

python experiment.py --mode train --dataset_config config/wireframe_dataset.yaml --model_config config/train_detector.yaml --exp_name sold2_wireframe --pretrained --pretrained_path <path to the pre-trained sold2_synth> --checkpoint_name <name of the best checkpoint>

Lastly, you can resume a training run that was interrupted:

python experiment.py --mode train --dataset_config config/wireframe_dataset.yaml --model_config config/train_detector.yaml --exp_name sold2_wireframe --resume --resume_path <path to the model to resume> --checkpoint_name <name of the last checkpoint>
Step 5: Train the full pipeline on the Wireframe dataset

You first need to modify the field 'return_type' in config/wireframe_dataset.yaml to 'paired_desc'. The following command will then train the full model (detector + descriptor) on the Wireframe dataset:

python experiment.py --mode train --dataset_config config/wireframe_dataset.yaml --model_config config/train_full_pipeline.yaml --exp_name sold2_full_wireframe --pretrained --pretrained_path <path to the pre-trained sold2_wireframe> --checkpoint_name <name of the best checkpoint>

Pretrained models

We provide the checkpoints of two pretrained models: one for the detector trained on the synthetic dataset, and one for the full pipeline trained on the Wireframe dataset.

How to use it

We provide a notebook showing how to use the trained model of SOLD². Additionally, you can use the model to export line features (segments and descriptor maps) as follows:

python export_line_features.py --img_list <path to a txt file containing the paths to all images> --output_folder <path to the output folder> --checkpoint_path <path to your best checkpoint>

You can tune some of the line detection parameters in config/export_line_features.yaml, in particular 'detect_thresh' and 'inlier_thresh', to adapt them to your trained model and the type of images.
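To give an idea of how the exported dense descriptor maps can be used downstream, the sketch below samples point descriptors along a segment and scores two lines by mean cosine similarity. This only illustrates the principle; it is not the matching procedure implemented in this repository:

```python
import numpy as np

def sample_line_descriptors(desc_map, p1, p2, num_samples=5):
    """Sample a dense descriptor map of shape (C, H, W) at points along a segment.

    p1, p2: (x, y) endpoints in pixel coordinates. Nearest-neighbour lookup is
    used here for brevity; bilinear interpolation would be more accurate.
    """
    ts = np.linspace(0.0, 1.0, num_samples)
    pts = np.outer(1.0 - ts, p1) + np.outer(ts, p2)            # (num_samples, 2)
    xs = np.clip(np.round(pts[:, 0]).astype(int), 0, desc_map.shape[2] - 1)
    ys = np.clip(np.round(pts[:, 1]).astype(int), 0, desc_map.shape[1] - 1)
    d = desc_map[:, ys, xs].T                                   # (num_samples, C)
    return d / np.linalg.norm(d, axis=1, keepdims=True)        # L2-normalized

def line_similarity(desc_a, desc_b):
    """Mean cosine similarity between the point-wise descriptors of two lines."""
    return float(np.mean(np.sum(desc_a * desc_b, axis=1)))
```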

Results

Comparison of repeatability and localization error to the state of the art on the Wireframe dataset for an error threshold of 5 pixels in structural and orthogonal distances:

| Method        | Struct. Rep-5 | Struct. Loc-5 | Orth. Rep-5 | Orth. Loc-5 |
|---------------|---------------|---------------|-------------|-------------|
| LCNN          | 0.434         | 2.589         | 0.570       | 1.725       |
| HAWP          | 0.451         | 2.625         | 0.537       | 1.725       |
| DeepHough     | 0.419         | 2.576         | 0.618       | 1.720       |
| TP-LSD TP512  | 0.563         | 2.467         | 0.746       | 1.450       |
| LSD           | 0.358         | 2.079         | 0.707       | 0.825       |
| Ours with NMS | 0.557         | 1.995         | 0.801       | 1.119       |
| Ours          | 0.616         | 2.019         | 0.914       | 0.816       |

Matching precision-recall curves on the Wireframe and ETH3D datasets: pred_lines_pr_curve

Bibtex

If you use this code in your project, please consider citing the following paper:

@InProceedings{Pautrat_Lin_2021_CVPR,
    author = {Pautrat, Rémi* and Lin, Juan-Ting* and Larsson, Viktor and Oswald, Martin R. and Pollefeys, Marc},
    title = {SOLD²: Self-supervised Occlusion-aware Line Description and Detection},
    booktitle = {Computer Vision and Pattern Recognition (CVPR)},
    year = {2021},
}