Source code for our paper "Do Not Trust Prediction Scores for Membership Inference Attacks"

Overview

Do Not Trust Prediction Scores for Membership Inference Attacks

False-Positive Examples

Abstract: Membership inference attacks (MIAs) aim to determine whether a specific sample was used to train a predictive model. Knowing this may indeed lead to a privacy breach. Arguably, most MIAs, however, make use of the model's prediction scores---the probability of each output given some input---following the intuition that the trained model tends to behave differently on its training data. We argue that this is a fallacy for many modern deep network architectures, e.g., ReLU type neural networks produce almost always high prediction scores far away from the training data. Consequently, MIAs will miserably fail since this behavior leads to high false-positive rates not only on known domains but also on out-of-distribution data and implicitly acts as a defense against MIAs. Specifically, using generative adversarial networks, we are able to produce a potentially infinite number of samples falsely classified as part of the training data. In other words, the threat of MIAs is overestimated and less information is leaked than previously assumed. Moreover, there is actually a trade-off between the overconfidence of classifiers and their susceptibility to MIAs: the more classifiers know when they do not know, making low confidence predictions far away from the training data, the more they reveal the training data.
Arxiv Preprint (PDF)

Membership Inference Attacks

Membership Inference Attacks


Membership Inference Attack Preparation Process

In a general MIA setting, as usually assumed in the literature, an adversary is given an input x following distribution D and a target model which was trained on a training set with size S_train consisting of samples from D. The adversary is then facing the problem to identify whether a given x following D was part of the training set S_train. To predict the membership of x, the adversary creates an inference model h. In score-based MIAs, the input to h is the prediction score vector produced by the target model on sample x (see first figure above). Since MIAs are binary classification problems, precision, recall and false-positive rate (FPR) are used as attack evaluation metrics.

All MIAs exploit a difference in the behavior of the target model on seen and unseen data. Most attacks in the literature follow Shokri et al. and train so-called shadow models shadow models on a disjoint dataset S_shadow drawn from the same distribution D as S_train. The shadow model is used to mimic the behavior of the target model and adjust parameters of h, such as threshold values or model weights. Note that the membership status for inputs to the shadow models are known to the adversary (see second figure above).

Setup and Run Experiments

Setup StyleGAN2-ADA

To recreate our Fake datasets containing synthetic CIFAR-10 and Stanford Dog images, you need to clone the official StyleGAN-2-Pytorch repo into the folder datasets.

cd datasets
git clone https://github.com/NVlabs/stylegan2-ada-pytorch.git
rm -r --force stylegan2-ada-pytorch/.git/

You can also safely remove all folders in the /datasets/stylegan2-ada-pytorch folder but /dnnlib and /torch_utils.

Setup Docker Container

To build the Docker container run the following script:

./docker_build.sh -n confidence_mi

To start the docker container run the following command from the project's root:

docker run --rm --shm-size 16G --name my_confidence_mi --gpus '"device=0"' -v $(pwd):/workspace/confidences -it confidence_mi bash

Download Trained Models

We provide our trained models on which we performed our experiments. To automatically download and extract the files use the following command:

bash download_pretrained_models.sh

To manually download single models, please visit https://hessenbox.tu-darmstadt.de/getlink/fiBg5znMtAagRe58sCrrLtyg/pretrained_models.

Reproduce Results from the Paper

All our experiments based on CIFAR-10 and Stanford Dogs can be reproduced using the pre-trained models by running the following scripts:

python experiments/cifar10_experiments.py
python experiments/stanford_dogs_experiments.py

If you want to train the models from scratch, the following commands can be used:

python experiments/cifar10_experiments.py --train
python experiments/stanford_dogs_experiments.py --train --pretrained

We use command line arguments to specify the hyperparameters of the training and attacking process. Default values correspond to the parameters used for training the target models as stated in the paper. The same applies for the membership inference attacks. To train models with label smoothing, L2 or LLLA, run the experiments with --label_smoothing, --weight_decay or --llla. We set the seed to 42 (default value) for all experiments. For further command line arguments and details, please refer to the python files.

Attack results will be stored in csv files at /experiments/results/{MODEL_ARCH}_{DATASET_NAME}_{MODIFIERS}_attack_results.csv and state precision, recall, fpr and mmps values for the various input datasets and membership inference attacks. Results for training the target and shadow models will be stored in the first column at /experiments/results/{MODEL_ARCH}_{DATASET_NAME}_{MODIFIERS}_performance_results.csv. They state the training and test accuracy, as well as the ECE.

Datasets

All data is required to be located in /data/. To recreate the Fake datasets using StyleGAN2-ADA to generate CIFAR-10 and dog samples, use /datasets/fake_cifar10.py and /datasets/fake_dogs.py. For example, Fake Dogs samples are located at /data/fake_afhq_dogs/Images after generation. If the files are missing or corrupted (checked by MD5 checksum), the images will be regenerated to restore the identical datasets used in the paper. This process will be automatically called when running one of the experiments. We use various datasets in our experiments. The following figure gives a short overview over the content and visual styles of the datasets.

Membership Inference Attacks

Citation

If you build upon our work, please don't forget to cite us.

@misc{hintersdorf2021trust,
      title={Do Not Trust Prediction Scores for Membership Inference Attacks}, 
      author={Dominik Hintersdorf and Lukas Struppek and Kristian Kersting},
      year={2021},
      eprint={2111.09076},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Implementation Credits

Some of our implementations rely on other repos. We want to thank the authors for making their code publicly available. For license details refer to the corresponding files in our repo. For more details on the specific functionality, please visit the corresponding repos.

Owner
[email protected]
Machine Learning Group at TU Darmstadt
<a href=[email protected]">
Rational Activation Functions - Replacing Padé Activation Units

Rational Activations - Learnable Rational Activation Functions First introduce as PAU in Padé Activation Units: End-to-end Learning of Activation Func

<a href=[email protected]"> 38 Nov 22, 2022
PyTorch implementation of DD3D: Is Pseudo-Lidar needed for Monocular 3D Object detection?

PyTorch implementation of DD3D: Is Pseudo-Lidar needed for Monocular 3D Object detection? (ICCV 2021), Dennis Park*, Rares Ambrus*, Vitor Guizilini, Jie Li, and Adrien Gaidon.

Toyota Research Institute - Machine Learning 364 Dec 27, 2022
A MNIST-like fashion product database. Benchmark

Fashion-MNIST Table of Contents Why we made Fashion-MNIST Get the Data Usage Benchmark Visualization Contributing Contact Citing Fashion-MNIST License

Zalando Research 10.5k Jan 08, 2023
using STGCN to achieve egg classification task

EEG Classification   The task requires us to classify electroencephalography(EEG) into six categories, including human body, human face, animal body,

4 Jun 13, 2022
MM1 and MMC Queue Simulation using python - Results and parameters in excel and csv files

implementation of MM1 and MMC Queue on randomly generated data and evaluate simulation results then compare with analytical results and draw a plot curve for them, simulate some integrals and compare

Mohamadreza Rezaei 1 Jan 19, 2022
Free course that takes you from zero to Reinforcement Learning PRO 🦸🏻‍🦸🏽

The Hands-on Reinforcement Learning course 🚀 From zero to HERO 🦸🏻‍🦸🏽 Out of intense complexities, intense simplicities emerge. -- Winston Churchi

Pau Labarta Bajo 260 Dec 28, 2022
PyTorch original implementation of Cross-lingual Language Model Pretraining.

XLM NEW: Added XLM-R model. PyTorch original implementation of Cross-lingual Language Model Pretraining. Includes: Monolingual language model pretrain

Facebook Research 2.7k Dec 27, 2022
Extracting and filtering paraphrases by bridging natural language inference and paraphrasing

nli2paraphrases Source code repository accompanying the preprint Extracting and filtering paraphrases by bridging natural language inference and parap

Matej Klemen 1 Mar 09, 2022
The pure and clear PyTorch Distributed Training Framework.

The pure and clear PyTorch Distributed Training Framework. Introduction Requirements and Usage Dependency Dataset Basic Usage Slurm Cluster Usage Base

WILL LEE 208 Dec 20, 2022
A deep learning framework for historical document image analysis

DIVA-DAF Description A deep learning framework for historical document image analysis. How to run Install dependencies # clone project git clone https

9 Aug 04, 2022
WaveFake: A Data Set to Facilitate Audio DeepFake Detection

WaveFake: A Data Set to Facilitate Audio DeepFake Detection This is the code repository for our NeurIPS 2021 (Track on Datasets and Benchmarks) paper

Chair for Sys­tems Se­cu­ri­ty 27 Dec 22, 2022
Elevation Mapping on GPU.

Elevation Mapping cupy Overview This is a ros package of elevation mapping on GPU. Code are written in python and uses cupy for GPU calculation. * pla

Robotic Systems Lab - Legged Robotics at ETH Zürich 183 Dec 19, 2022
Pytorch Lightning Implementation of SC-Depth Methods.

SC_Depth_pl: This is a pytorch lightning implementation of SC-Depth (V1, V2) for self-supervised learning of monocular depth from video. In the V1 (IJ

JiaWang Bian 216 Dec 30, 2022
GPT, but made only out of gMLPs

GPT - gMLP This repository will attempt to crack long context autoregressive language modeling (GPT) using variations of gMLPs. Specifically, it will

Phil Wang 80 Dec 01, 2022
Code for Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations

Implementation for Iso-Points (CVPR 2021) Official code for paper Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations paper |

Yifan Wang 66 Nov 08, 2022
Source code for GNN-LSPE (Graph Neural Networks with Learnable Structural and Positional Representations)

Graph Neural Networks with Learnable Structural and Positional Representations Source code for the paper "Graph Neural Networks with Learnable Structu

Vijay Prakash Dwivedi 180 Dec 22, 2022
Official PyTorch implementation of "Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble" (NeurIPS'21)

Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble This is the code for reproducing the results of the paper Uncertainty-Bas

43 Nov 23, 2022
PyTorch Live is an easy to use library of tools for creating on-device ML demos on Android and iOS.

PyTorch Live is an easy to use library of tools for creating on-device ML demos on Android and iOS. With Live, you can build a working mobile app ML demo in minutes.

559 Jan 01, 2023
Source code for Task-Aware Variational Adversarial Active Learning

Contrastive Coding for Active Learning under Class Distribution Mismatch Official PyTorch implementation of ["Contrastive Coding for Active Learning u

27 Nov 23, 2022
Implementation of the paper "Fine-Tuning Transformers: Vocabulary Transfer"

Transformer-vocabulary-transfer Implementation of the paper "Fine-Tuning Transfo

LEYA 13 Nov 30, 2022