Joint deep network for feature line detection and description

Related tags

Deep LearningSOLD2
Overview

SOLD² - Self-supervised Occlusion-aware Line Description and Detection

This repository contains the implementation of the paper: SOLD² : Self-supervised Occlusion-aware Line Description and Detection, J-T. Lin*, R. Pautrat*, V. Larsson, M. Oswald and M. Pollefeys (Oral at CVPR 2021).

SOLD² is a deep line segment detector and descriptor that can be trained without hand-labelled line segments and that can robustly match lines even in the presence of occlusion.

Demos

Matching in the presence of occlusion: demo_occlusion

Matching with a moving camera: demo_moving_camera

Usage

Installation

We recommend using this code in a Python environment (e.g. venv or conda). The following script installs the necessary requirements with pip:

pip install -r requirements.txt

Set your dataset and experiment paths (where you will store your datasets and checkpoints of your experiments) by modifying the file config/project_config.py. Both variables DATASET_ROOT and EXP_PATH have to be set.

You can download the version of the Wireframe dataset that we used during our training and testing here. This repository also includes some files to train on the Holicity dataset to add more outdoor images, but note that we did not extensively test this dataset and the original paper was based on the Wireframe dataset only.

Training your own model

All training parameters are located in configuration files in the folder config. Training SOLD² from scratch requires several steps, some of which taking several days, depending on the size of your dataset.

Step 1: Train on a synthetic dataset

The following command will create the synthetic dataset and start training the model on it:

python experiment.py --mode train --dataset_config config/synthetic_dataset.yaml --model_config config/train_detector.yaml --exp_name sold2_synth
Step 2: Export the raw pseudo ground truth on the Wireframe dataset with homography adaptation

Note that this step can take one to several days depending on your machine and on the size of the dataset. You can set the batch size to the maximum capacity that your GPU can handle.

python experiment.py --exp_name wireframe_train --mode export --resume_path <path to your previously trained sold2_synth> --model_config config/train_detector.yaml --dataset_config config/wireframe_dataset.yaml --checkpoint_name <name of the best checkpoint> --export_dataset_mode train --export_batch_size 4

You can similarly perform the same for the test set:

python experiment.py --exp_name wireframe_test --mode export --resume_path <path to your previously trained sold2_synth> --model_config config/train_detector.yaml --dataset_config config/wireframe_dataset.yaml --checkpoint_name <name of the best checkpoint> --export_dataset_mode test --export_batch_size 4
Step3: Compute the ground truth line segments from the raw data
cd postprocess
python convert_homography_results.py <name of the previously exported file (e.g. "wireframe_train.h5")> <name of the new data with extracted line segments (e.g. "wireframe_train_gt.h5")> ../config/export_line_features.yaml
cd ..

We recommend testing the results on a few samples of your dataset to check the quality of the output, and modifying the hyperparameters if need be. Using a detect_thresh=0.5 and inlier_thresh=0.99 proved to be successful for the Wireframe dataset in our case for example.

Step 4: Train the detector on the Wireframe dataset

We found it easier to pretrain the detector alone first, before fine-tuning it with the descriptor part. Uncomment the lines 'gt_source_train' and 'gt_source_test' in config/wireframe_dataset.yaml and fill them with the path to the h5 file generated in the previous step.

python experiment.py --mode train --dataset_config config/wireframe_dataset.yaml --model_config config/train_detector.yaml --exp_name sold2_wireframe

Alternatively, you can also fine-tune the already trained synthetic model:

python experiment.py --mode train --dataset_config config/wireframe_dataset.yaml --model_config config/train_detector.yaml --exp_name sold2_wireframe --pretrained --pretrained_path <path ot the pre-trained sold2_synth> --checkpoint_name <name of the best checkpoint>

Lastly, you can resume a training that was stopped:

python experiment.py --mode train --dataset_config config/wireframe_dataset.yaml --model_config config/train_detector.yaml --exp_name sold2_wireframe --resume --resume_path <path to the  model to resume> --checkpoint_name <name of the last checkpoint>
Step 5: Train the full pipeline on the Wireframe dataset

You first need to modify the field 'return_type' in config/wireframe_dataset.yaml to 'paired_desc'. The following command will then train the full model (detector + descriptor) on the Wireframe dataset:

python experiment.py --mode train --dataset_config config/wireframe_dataset.yaml --model_config config/train_full_pipeline.yaml --exp_name sold2_full_wireframe --pretrained --pretrained_path <path ot the pre-trained sold2_wireframe> --checkpoint_name <name of the best checkpoint>

Pretrained models

We provide the checkpoints of two pretrained models:

How to use it

We provide a notebook showing how to use the trained model of SOLD². Additionally, you can use the model to export line features (segments and descriptor maps) as follows:

python export_line_features.py --img_list <list to a txt file containing the path to all the images> --output_folder <path to the output folder> --checkpoint_path <path to your best checkpoint,>

You can tune some of the line detection parameters in config/export_line_features.yaml, in particular the 'detect_thresh' and 'inlier_thresh' to adapt them to your trained model and type of images.

Results

Comparison of repeatability and localization error to the state of the art on the Wireframe dataset for an error threshold of 5 pixels in structural and orthogonal distances:

Structural distance Orthogonal distance
Rep-5 Loc-5 Rep-5 Loc-5
LCNN 0.434 2.589 0.570 1.725
HAWP 0.451 2.625 0.537 1.725
DeepHough 0.419 2.576 0.618 1.720
TP-LSD TP512 0.563 2.467 0.746 1.450
LSD 0.358 2.079 0.707 0.825
Ours with NMS 0.557 1.995 0.801 1.119
Ours 0.616 2.019 0.914 0.816

Matching precision-recall curves on the Wireframe and ETH3D datasets: pred_lines_pr_curve

Bibtex

If you use this code in your project, please consider citing the following paper:

@InProceedings{Pautrat_Lin_2021_CVPR,
    author = {Pautrat, Rémi* and Juan-Ting, Lin* and Larsson, Viktor and Oswald, Martin R. and Pollefeys, Marc},
    title = {SOLD²: Self-supervised Occlusion-aware Line Description and Detection},
    booktitle = {Computer Vision and Pattern Recognition (CVPR)},
    year = {2021},
}
Owner
Computer Vision and Geometry Lab
Computer Vision and Geometry Lab
“Data Augmentation for Cross-Domain Named Entity Recognition” (EMNLP 2021)

Data Augmentation for Cross-Domain Named Entity Recognition Authors: Shuguang Chen, Gustavo Aguilar, Leonardo Neves and Thamar Solorio This repository

<a href=[email protected]"> 18 Sep 10, 2022
FindFunc is an IDA PRO plugin to find code functions that contain a certain assembly or byte pattern, reference a certain name or string, or conform to various other constraints.

FindFunc: Advanced Filtering/Finding of Functions in IDA Pro FindFunc is an IDA Pro plugin to find code functions that contain a certain assembly or b

213 Dec 17, 2022
SurfEmb (CVPR 2022) - SurfEmb: Dense and Continuous Correspondence Distributions

SurfEmb SurfEmb: Dense and Continuous Correspondence Distributions for Object Pose Estimation with Learnt Surface Embeddings Rasmus Laurvig Haugard, A

Rasmus Haugaard 56 Nov 19, 2022
Tools for manipulating UVs in the Blender viewport.

UV Tool Suite for Blender A set of tools to make editing UVs easier in Blender. These tools can be accessed wither through the Kitfox - UV panel on th

35 Oct 29, 2022
(CVPR2021) Kaleido-BERT: Vision-Language Pre-training on Fashion Domain

Kaleido-BERT: Vision-Language Pre-training on Fashion Domain Mingchen Zhuge*, Dehong Gao*, Deng-Ping Fan#, Linbo Jin, Ben Chen, Haoming Zhou, Minghui

248 Dec 04, 2022
VarCLR: Variable Semantic Representation Pre-training via Contrastive Learning

    VarCLR: Variable Representation Pre-training via Contrastive Learning New: Paper accepted by ICSE 2022. Preprint at arXiv! This repository contain

squaresLab 32 Oct 24, 2022
A highly modular PyTorch framework with a focus on Neural Architecture Search (NAS).

UniNAS A highly modular PyTorch framework with a focus on Neural Architecture Search (NAS). under development (which happens mostly on our internal Gi

Cognitive Systems Research Group 19 Nov 23, 2022
source code for 'Finding Valid Adjustments under Non-ignorability with Minimal DAG Knowledge' by A. Shah, K. Shanmugam, K. Ahuja

Source code for "Finding Valid Adjustments under Non-ignorability with Minimal DAG Knowledge" Reference: Abhin Shah, Karthikeyan Shanmugam, Kartik Ahu

Abhin Shah 1 Jun 03, 2022
LTR_CrossEncoder: Legal Text Retrieval Zalo AI Challenge 2021

LTR_CrossEncoder: Legal Text Retrieval Zalo AI Challenge 2021 We propose a cross encoder model (LTR_CrossEncoder) for information retrieval, re-retrie

Hieu Duong 7 Jan 12, 2022
[ICRA 2022] CaTGrasp: Learning Category-Level Task-Relevant Grasping in Clutter from Simulation

This is the official implementation of our paper: Bowen Wen, Wenzhao Lian, Kostas Bekris, and Stefan Schaal. "CaTGrasp: Learning Category-Level Task-R

Bowen Wen 199 Jan 04, 2023
Minimal implementation of PAWS (https://arxiv.org/abs/2104.13963) in TensorFlow.

PAWS-TF 🐾 Implementation of Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting View Assignments with Support Samples (PAWS)

Sayak Paul 43 Jan 08, 2023
Pytorch implementation of Integrating Tree Path in Transformer for Code Representation

This is an official Pytorch implementation of the approaches proposed in: Han Peng, Ge Li, Wenhan Wang, Yunfei Zhao, Zhi Jin “Integrating Tree Path in

Han Peng 16 Dec 23, 2022
Aircraft design optimization made fast through modern automatic differentiation

Aircraft design optimization made fast through modern automatic differentiation. Plug-and-play analysis tools for aerodynamics, propulsion, structures, trajectory design, and much more.

Peter Sharpe 394 Dec 23, 2022
A deep neural networks for images using CNN algorithm.

Example-CNN-Project This is a simple project showing how to implement deep neural networks using CNN algorithm. The dataset is taken from this link: h

Mohammad Amin Dadgar 3 Sep 16, 2022
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation

Build Type Linux MacOS Windows Build Status OpenPose has represented the first real-time multi-person system to jointly detect human body, hand, facia

25.7k Jan 09, 2023
PyTorch implementation of a Real-ESRGAN model trained on custom dataset

Real-ESRGAN PyTorch implementation of a Real-ESRGAN model trained on custom dataset. This model shows better results on faces compared to the original

Sber AI 160 Jan 04, 2023
Implementation of the state of the art beat-detection, downbeat-detection and tempo-estimation model

The ISMIR 2020 Beat Detection, Downbeat Detection and Tempo Estimation Model Implementation. This is an implementation in TensorFlow to implement the

Koen van den Brink 1 Nov 12, 2021
Python script for performing depth completion from sparse depth and rgb images using the msg_chn_wacv20. model in ONNX

ONNX msg_chn_wacv20 depth completion Python script for performing depth completion from sparse depth and rgb images using the msg_chn_wacv20 model in

Ibai Gorordo 19 Oct 22, 2022
code for `Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation`

Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation (CVPR 2021) Introduction PBR is a conceptually simple yet effective

H.Chen 143 Jan 05, 2023
Code & Models for Temporal Segment Networks (TSN) in ECCV 2016

Temporal Segment Networks (TSN) We have released MMAction, a full-fledged action understanding toolbox based on PyTorch. It includes implementation fo

1.4k Jan 01, 2023