Hold me tight! Influence of discriminative features on deep network boundaries

This is the source code to reproduce the experiments of the NeurIPS 2020 paper "Hold me tight! Influence of discriminative features on deep network boundaries" by Guillermo Ortiz-Jimenez*, Apostolos Modas*, Seyed-Mohsen Moosavi-Dezfooli and Pascal Frossard.

Abstract

Important insights towards the explainability of neural networks reside in the characteristics of their decision boundaries. In this work, we borrow tools from the field of adversarial robustness, and propose a new perspective that relates dataset features to the distance of samples to the decision boundary. This enables us to carefully tweak the position of the training samples and measure the induced changes on the boundaries of CNNs trained on large-scale vision datasets. We use this framework to reveal some intriguing properties of CNNs. Specifically, we rigorously confirm that neural networks exhibit a high invariance to non-discriminative features, and show that very small perturbations of the training samples in certain directions can lead to sudden invariances in the orthogonal ones. This is precisely the mechanism that adversarial training uses to achieve robustness.
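
The link between features and the distance to the decision boundary is easiest to see for a linear classifier, where the margin restricted to a subspace has a closed form. The sketch below is only an illustration under that linear assumption and is not the procedure used in the paper's experiments, which target deep networks. The function name `subspace_margin` and its arguments are hypothetical.

```python
import math

def subspace_margin(w, b, x, basis):
    """Distance from x to the hyperplane w.x + b = 0 when the perturbation
    is restricted to span(basis). `basis` must hold orthonormal vectors.
    (Hypothetical helper for illustration; linear classifier assumed.)"""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Norm of the projection of w onto the subspace.
    proj_norm = math.sqrt(
        sum(sum(wi * vi for wi, vi in zip(w, v)) ** 2 for v in basis)
    )
    if proj_norm == 0.0:
        # w has no component in this subspace: the classifier is invariant
        # to it and the boundary cannot be reached from x.
        return math.inf
    return abs(score) / proj_norm

# The classifier only uses the first coordinate, so the margin along the
# second coordinate is infinite.
print(subspace_margin([1.0, 0.0], 0.0, [2.0, 5.0], [[1.0, 0.0]]))
print(subspace_margin([1.0, 0.0], 0.0, [2.0, 5.0], [[0.0, 1.0]]))
```

In this toy setting, a large margin along a direction means the classifier ignores that direction, which is the kind of invariance the abstract refers to.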

Dependencies

To run our code on a Linux machine with a GPU, install the Python packages in a fresh Anaconda environment:

$ conda env create -f environment.yml
$ conda activate hold_me_tight

Experiments

This repository contains code to reproduce the following experiments:

Train and compute the margin distribution on MNIST
Train and compute the margin distribution on Frequency-flipped MNIST
Train and compute the margin distribution on CIFAR10
Train and compute the margin distribution on Frequency-flipped CIFAR10
Train and compute the margin distribution on Low-Pass CIFAR10
Compute the margin distribution on Robust MNIST
Compute the margin distribution on Robust CIFAR10
Compute the margin distribution on Robust ImageNet
Compute the margin distribution on Frequency-flipped ImageNet

You can reproduce these experiments separately using their individual scripts, or have a look at the comprehensive Jupyter notebook.
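
As a rough illustration of what a "frequency-flipped" dataset could look like, the sketch below reverses the order of an image's 2-D DCT coefficients along both axes and transforms back. This is an assumption about the flip operation, written without dependencies for readability. It does not reproduce the repository's own code (utils_dct.py is the likely home of the DCT utilities, going by its name), and all function names here are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (rows are basis vectors)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dct2(image):
    """2-D DCT of a square image: D X D^T."""
    d = dct_matrix(len(image))
    return matmul(matmul(d, image), transpose(d))

def idct2(coeffs):
    """Inverse 2-D DCT: D^T C D (valid because D is orthonormal)."""
    d = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(d), coeffs), d)

def frequency_flip(image):
    """Swap low and high frequencies by reversing the DCT coefficient
    order along both axes (assumed definition of the flip)."""
    coeffs = dct2(image)
    flipped = [row[::-1] for row in coeffs[::-1]]
    return idct2(flipped)

# A constant image holds only the lowest frequency; after the flip all of
# its energy sits in the highest frequency instead.
flipped = frequency_flip([[1.0] * 4 for _ in range(4)])
```

Because the DCT matrix is orthonormal and the coefficient reversal is its own inverse, applying `frequency_flip` twice recovers the original image up to floating-point error, and the image's L2 norm is preserved.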

Pretrained architectures

We also provide the set of pretrained models that we used in our experiments. The exact hyperparameters and settings can be found in the supplementary material of the paper. All the models are publicly available and can be downloaded from here. To run the scripts with the pretrained models, download them and save them under the Models/Pretrained/ directory.

Architecture	Dataset	Training method
LeNet	MNIST	Standard
ResNet18	MNIST	Standard
ResNet18	CIFAR10	Standard
VGG19	CIFAR10	Standard
DenseNet121	CIFAR10	Standard
LeNet	Flipped MNIST	Standard + Frequency flip
ResNet18	Flipped MNIST	Standard + Frequency flip
ResNet18	Flipped CIFAR10	Standard + Frequency flip
VGG19	Flipped CIFAR10	Standard + Frequency flip
DenseNet121	Flipped CIFAR10	Standard + Frequency flip
ResNet50	Flipped ImageNet	Standard + Frequency flip
ResNet18	Low-pass CIFAR10	Standard + Low-pass filtering
VGG19	Low-pass CIFAR10	Standard + Low-pass filtering
DenseNet121	Low-pass CIFAR10	Standard + Low-pass filtering
Robust LeNet	MNIST	L2 PGD adversarial training (eps = 2)
Robust ResNet18	MNIST	L2 PGD adversarial training (eps = 2)
Robust ResNet18	CIFAR10	L2 PGD adversarial training (eps = 1)
Robust VGG19	CIFAR10	L2 PGD adversarial training (eps = 1)
Robust DenseNet121	CIFAR10	L2 PGD adversarial training (eps = 1)
Robust ResNet50	ImageNet	L2 PGD adversarial training (eps = 3) (copied from here)
Robust LeNet	Flipped MNIST	L2 PGD adversarial training (eps = 2) with Dykstra projection + Frequency flip
Robust ResNet18	Flipped MNIST	L2 PGD adversarial training (eps = 2) with Dykstra projection + Frequency flip
Robust ResNet18	Flipped CIFAR10	L2 PGD adversarial training (eps = 1) with Dykstra projection + Frequency flip
Robust VGG19	Flipped CIFAR10	L2 PGD adversarial training (eps = 1) with Dykstra projection + Frequency flip
Robust DenseNet121	Flipped CIFAR10	L2 PGD adversarial training (eps = 1) with Dykstra projection + Frequency flip
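
The "Dykstra projection" in the table refers to Dykstra's alternating projection algorithm, which projects a point onto the intersection of convex sets. The sketch below shows the generic algorithm for two sets. Which sets are intersected during the authors' adversarial training is not stated here, so the choice of an L2 ball of radius eps around a sample and the [0, 1] box is an assumption, and all function names are hypothetical.

```python
import math

def project_l2_ball(z, center, eps):
    """Project z onto the L2 ball of radius eps around center."""
    diff = [zi - ci for zi, ci in zip(z, center)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm <= eps:
        return list(z)
    return [ci + eps * d / norm for ci, d in zip(center, diff)]

def project_box(z, lo=0.0, hi=1.0):
    """Project z onto the box [lo, hi]^n by clipping each coordinate."""
    return [min(max(zi, lo), hi) for zi in z]

def dykstra(z, center, eps, iters=100):
    """Approximate the projection of z onto the intersection of the L2 ball
    and the box with Dykstra's algorithm (assumed pair of sets)."""
    x = list(z)
    p = [0.0] * len(z)  # correction term for the ball projection
    q = [0.0] * len(z)  # correction term for the box projection
    for _ in range(iters):
        y = project_l2_ball([xi + pi for xi, pi in zip(x, p)], center, eps)
        p = [xi + pi - yi for xi, pi, yi in zip(x, p, y)]
        x = project_box([yi + qi for yi, qi in zip(y, q)])
        q = [yi + qi - xi for yi, qi, xi in zip(y, q, x)]
    return x

# Project a point far outside both sets.
point = dykstra([2.0, 2.0], center=[0.9, 0.9], eps=0.5)
```

Unlike plain alternating projections, which only return some point in the intersection, the correction terms `p` and `q` make the iterates converge to the closest point in the intersection.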

Reference

If you use this code, or some of the attached models, please cite the following paper:

@InCollection{OrtizModasHMT2020,
  TITLE = {{Hold me tight! Influence of discriminative features on deep network boundaries}},
  AUTHOR = {{Ortiz-Jimenez}, Guillermo and {Modas}, Apostolos and {Moosavi-Dezfooli}, Seyed-Mohsen and Frossard, Pascal},
  BOOKTITLE = {Advances in Neural Information Processing Systems 33},
  MONTH = dec,
  YEAR = {2020}
}
