PyTorch implementation of SIFT descriptor

Last update: Dec 24, 2022

Overview

This is an differentiable pytorch implementation of SIFT patch descriptor. It is very slow for describing one patch, but quite fast for batch. It can be used for descriptop-based learning shape of affine feature.

UPD 08/2019 : pytorch-sift is added to kornia and available by kornia.features.SIFTDescriptor

There are different implementations of the SIFT on the web. I tried to match Michal Perdoch implementation, which gives high quality features for image retrieval CVPR2009. However, on planar datasets, it is inferior to vlfeat implementation. The main difference is gaussian weighting window parameters, so I have made a vlfeat-like version too. MP version weights patch center much more (see image below, left) and additionally crops everything outside the circular region. Right is vlfeat version

descriptor_mp_mode = SIFTNet(patch_size = 65,
                        sigma_type= 'hesamp',
                        masktype='CircularGauss')

descriptor_vlfeat_mode = SIFTNet(patch_size = 65,
                        sigma_type= 'vlfeat',
                        masktype='Gauss')

Results:

OPENCV-SIFT - mAP 
   Easy     Hard      Tough     mean
-------  -------  ---------  -------
0.47788  0.20997  0.0967711  0.26154

VLFeat-SIFT - mAP 
    Easy      Hard      Tough      mean
--------  --------  ---------  --------
0.466584  0.203966  0.0935743  0.254708

PYTORCH-SIFT-VLFEAT-65 - mAP 
    Easy      Hard      Tough      mean
--------  --------  ---------  --------
0.472563  0.202458  0.0910371  0.255353

NUMPY-SIFT-VLFEAT-65 - mAP 
    Easy      Hard      Tough      mean
--------  --------  ---------  --------
0.449431  0.197918  0.0905395  0.245963

PYTORCH-SIFT-MP-65 - mAP 
    Easy      Hard      Tough      mean
--------  --------  ---------  --------
0.430887  0.184834  0.0832707  0.232997

NUMPY-SIFT-MP-65 - mAP 
    Easy     Hard      Tough      mean
--------  -------  ---------  --------
0.417296  0.18114  0.0820582  0.226832

Speed:

0.00246 s per 65x65 patch - numpy SIFT
0.00028 s per 65x65 patch - C++ SIFT
0.00074 s per 65x65 patch - CPU, 256 patches per batch
0.00038 s per 65x65 patch - GPU (GM940, mobile), 256 patches per batch
0.00038 s per 65x65 patch - GPU (GM940, mobile), 256 patches per batch

If you use this code for academic purposes, please cite the following paper:

@InProceedings{AffNet2018,
    title = {Repeatability Is Not Enough: Learning Affine Regions via Discriminability},
    author = {Dmytro Mishkin, Filip Radenovic, Jiri Matas},
    booktitle = {Proceedings of ECCV},
    year = 2018,
    month = sep
}

PyTorch implementation of SIFT descriptor

Related tags

Overview

Owner

Dmytro Mishkin

Offline Multi-Agent Reinforcement Learning Implementations: Solving Overcooked Game with Data-Driven Method

Smart edu-autobooking - Johnson @ DMI-UNICT study room self-booking system

Dense Passage Retriever - is a set of tools and models for open domain Q&A task.

Modular Gaussian Processes

This repository contains the files for running the Patchify GUI.

Created as part of CS50 AI's coursework. This AI makes use of knowledge entailment to calculate the best probabilities to win Minesweeper.

Machine Learning toolbox for Humans

Realtime Face Anti Spoofing with Face Detector based on Deep Learning using Tensorflow/Keras and OpenCV

Face2webtoon - Despite its importance, there are few previous works applying I2I translation to webtoon.

StyleSwin: Transformer-based GAN for High-resolution Image Generation

Autoencoders pretraining using clustering

git《Pseudo-ISP: Learning Pseudo In-camera Signal Processing Pipeline from A Color Image Denoiser》(2021) GitHub: [fig5]

MVFNet: Multi-View Fusion Network for Efficient Video Recognition (AAAI 2021)

Few-shot Neural Architecture Search

Watch faces morph into each other with StyleGAN 2, StyleGAN, and DCGAN!

Denoising Diffusion Probabilistic Models

Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations. [2021]

PyTorch code for the NAACL 2021 paper "Improving Generation and Evaluation of Visual Stories via Semantic Consistency"

Example scripts for the detection of lanes using the ultra fast lane detection model in Tensorflow Lite.

Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation