Towards Interpretable Deep Metric Learning with Structural Matching

Last update: Nov 11, 2022

Overview

DIML

Created by Wenliang Zhao*, Yongming Rao*, Ziyi Wang, Jiwen Lu, Jie Zhou

This repository contains PyTorch implementation for paper Towards Interpretable Deep Metric Learning with Structural Matching (ICCV 2021).

We present a deep interpretable metric learning (DIML) that adopts a structural matching strategy to explicitly aligns the spatial embeddings by computing an optimal matching flow between feature maps of the two images. Our method enables deep models to learn metrics in a more human-friendly way, where the similarity of two images can be decomposed to several part-wise similarities and their contributions to the overall similarity. Our method is model-agnostic, which can be applied to off-the-shelf backbone networks and metric learning methods.

[arXiv]

Usage

Requirement

python3
PyTorch 1.7

Dataset Preparation

Please follow the instruction in RevisitDML to download the datasets and put all the datasets in data folder. The structure should be:

data
├── cars196
│   └── images
├── cub200
│   └── images
└── online_products
    ├── images
    └── Info_Files

Training & Evaluation

To train the baseline models, run the scripts in scripts/baselines. For example:

CUDA_VISIBLE_DEVICES=0 ./script/baselines/cub_runs.sh

The checkpoints are saved in Training_Results folder.

To test the baseline models with our proposed DIML, first edit the checkpoint paths in test_diml.py, then run

CUDA_VISIBLE_DEVICES=0 ./scripts/diml/test_diml.sh cub200

The results will be written to test_results/test_diml_<dataset>.csv in CSV format.

You can also incorporate DIML into the training objectives. We provide two examples which apply DIML to Margin and Multi-Similarity loss. To train DIML models, run

# ./scripts/diml/train_diml.sh <dataset> <batch_size> <loss> <num_epochs>
# where loss could be margin_diml or multisimilarity_diml
# e.g.
CUDA_VISIBLE_DEVICES=0 ./scripts/diml/train_diml.sh cub200 112 margin_diml 150

Acknowledgement

The code is based on RevisitDML.

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{zhao2021towards,
  title={Towards Interpretable Deep Metric Learning with Structural Matching},
  author={Zhao, Wenliang and Rao, Yongming and Wang, Ziyi and Lu, Jiwen and Zhou, Jie},
  booktitle={ICCV},
  year={2021}
}

Towards Interpretable Deep Metric Learning with Structural Matching

Related tags

Overview

DIML

Usage

Requirement

Dataset Preparation

Training & Evaluation

Acknowledgement

Citation

Owner

Wenliang Zhao

Extracting and filtering paraphrases by bridging natural language inference and paraphrasing

yolox_backbone is a deep-learning library and is a collection of YOLOX Backbone models.

Minimisation of a negative log likelihood fit to extract the lifetime of the D^0 meson (MNLL2ELDM)

Causal-Adversarial-Instruments - PyTorch Implementation for Developing Library of Investigating Adversarial Examples on A Causal View by Instruments

Orthogonal Over-Parameterized Training

Unified learning approach for egocentric hand gesture recognition and fingertip detection

Genpass - A Passwors Generator App With Python3

PClean: A Domain-Specific Probabilistic Programming Language for Bayesian Data Cleaning

Image augmentation library in Python for machine learning.

PaddleBoBo是基于PaddlePaddle和PaddleSpeech、PaddleGAN等开发套件的虚拟主播快速生成项目

PASSL包含 SimCLR，MoCo，BYOL，CLIP等基于对比学习的图像自监督算法以及 Vision-Transformer，Swin-Transformer，BEiT，CVT，T2T，MLP_Mixer等视觉Transformer算法

iBOT: Image BERT Pre-Training with Online Tokenizer

Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection from Point Clouds (CVPR 2022)

Molecular AutoEncoder in PyTorch

Pmapper is a super-resolution and deconvolution toolkit for python 3.6+

StackRec: Efficient Training of Very Deep Sequential Recommender Models by Iterative Stacking

Code for ViTAS_Vision Transformer Architecture Search

Pre-trained models for a Cascaded-FCN in caffe and tensorflow that segments

Manage the availability of workspaces within Frappe/ ERPNext (sidebar) based on user-roles

Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled - "A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek"