PyTorch implementation of SIFT descriptor

Last update: Dec 24, 2022

Overview

This is a differentiable PyTorch implementation of the SIFT patch descriptor. It is very slow when describing a single patch, but quite fast on batches. It can be used for descriptor-based learning of the shape of affine features.

UPD 08/2019: pytorch-sift has been added to kornia and is available as kornia.features.SIFTDescriptor

There are different implementations of SIFT on the web. I tried to match Michal Perdoch's implementation, which gives high-quality features for image retrieval (CVPR 2009). However, on planar datasets it is inferior to the vlfeat implementation. The main difference is in the parameters of the Gaussian weighting window, so I have made a vlfeat-like version too. The MP version weights the patch center much more (see image below, left) and additionally crops everything outside the circular region. The vlfeat version is on the right.

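# Assumes pytorch_sift.py from this repository is importable.
from pytorch_sift import SIFTNet

# Michal Perdoch (MP) mode: center-heavy Gaussian, cropped to a circle.
descriptor_mp_mode = SIFTNet(patch_size=65,
                             sigma_type='hesamp',
                             masktype='CircularGauss')

# vlfeat-like mode: Gaussian weighting without the circular crop.
descriptor_vlfeat_mode = SIFTNet(patch_size=65,
                                 sigma_type='vlfeat',
                                 masktype='Gauss')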
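The difference between the two weighting windows can be sketched with NumPy. This is not the code used by SIFTNet: the helper name gaussian_window and both sigma values are assumptions chosen only to show the effect of a narrower, circularly cropped window against a wider uncropped one.

```python
import numpy as np

def gaussian_window(patch_size, sigma, circular=False):
    """Return a (patch_size, patch_size) weighting window equal to 1 at the center."""
    # Pixel coordinates measured from the patch center.
    half = (patch_size - 1) / 2.0
    ys, xs = np.mgrid[0:patch_size, 0:patch_size] - half
    r2 = xs ** 2 + ys ** 2
    window = np.exp(-r2 / (2.0 * sigma ** 2))
    if circular:
        # Crop everything outside the inscribed circle.
        window[r2 > half ** 2] = 0.0
    return window

patch_size = 65
# Hypothetical sigmas: a narrow window for the MP-like mask, a wide one for the vlfeat-like mask.
mp_like = gaussian_window(patch_size, sigma=patch_size / 6.0, circular=True)
vlfeat_like = gaussian_window(patch_size, sigma=patch_size / 3.0)

# Corner pixel: removed by the circular crop, still weighted by the plain Gaussian.
print(mp_like[0, 0], vlfeat_like[0, 0])
# Pixel halfway to the border: down-weighted much more by the narrow window.
print(mp_like[32, 16], vlfeat_like[32, 16])
```

Both windows peak at the patch center; the narrow one falls off faster, so relative to the wide one it concentrates the descriptor on the center of the patch.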

Results:

OPENCV-SIFT - mAP 
   Easy     Hard      Tough     mean
-------  -------  ---------  -------
0.47788  0.20997  0.0967711  0.26154

VLFeat-SIFT - mAP 
    Easy      Hard      Tough      mean
--------  --------  ---------  --------
0.466584  0.203966  0.0935743  0.254708

PYTORCH-SIFT-VLFEAT-65 - mAP 
    Easy      Hard      Tough      mean
--------  --------  ---------  --------
0.472563  0.202458  0.0910371  0.255353

NUMPY-SIFT-VLFEAT-65 - mAP 
    Easy      Hard      Tough      mean
--------  --------  ---------  --------
0.449431  0.197918  0.0905395  0.245963

PYTORCH-SIFT-MP-65 - mAP 
    Easy      Hard      Tough      mean
--------  --------  ---------  --------
0.430887  0.184834  0.0832707  0.232997

NUMPY-SIFT-MP-65 - mAP 
    Easy     Hard      Tough      mean
--------  -------  ---------  --------
0.417296  0.18114  0.0820582  0.226832

Speed:

0.00246 s per 65x65 patch - numpy SIFT
0.00028 s per 65x65 patch - C++ SIFT
0.00074 s per 65x65 patch - CPU, 256 patches per batch
0.00038 s per 65x65 patch - GPU (GM940, mobile), 256 patches per batch
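To make the timings easier to compare, the per-patch times can be converted into throughput and into a speedup relative to the numpy version. The sketch below only does arithmetic on the numbers listed above; the variable names are illustrative.

```python
# Seconds per 65x65 patch, copied from the list above.
timings = {
    "numpy SIFT": 0.00246,
    "C++ SIFT": 0.00028,
    "CPU, 256 patches per batch": 0.00074,
    "GPU (GM940, mobile), 256 patches per batch": 0.00038,
}

baseline = timings["numpy SIFT"]

# Patches described per second, and speedup over the numpy implementation.
throughput = {name: 1.0 / sec for name, sec in timings.items()}
speedup = {name: baseline / sec for name, sec in timings.items()}

for name in timings:
    print(f"{name:45s} {throughput[name]:7.0f} patches/s  {speedup[name]:5.2f}x vs numpy")
```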

If you use this code for academic purposes, please cite the following paper:

@InProceedings{AffNet2018,
    title = {Repeatability Is Not Enough: Learning Affine Regions via Discriminability},
    author = {Dmytro Mishkin and Filip Radenovic and Jiri Matas},
    booktitle = {Proceedings of ECCV},
    year = 2018,
    month = sep
}

Owner

Dmytro Mishkin
