Source code for our paper "Learning to Break Deep Perceptual Hashing: The Use Case NeuralHash"

Overview

Learning to Break Deep Perceptual Hashing: The Use Case NeuralHash

Abstract: Apple recently revealed its deep perceptual hashing system NeuralHash to detect child sexual abuse material (CSAM) on user devices before files are uploaded to its iCloud service. Public criticism quickly arose regarding the protection of user privacy and the system's reliability. In this paper, we present the first comprehensive empirical analysis of deep perceptual hashing based on NeuralHash. Specifically, we show that current deep perceptual hashing may not be robust. An adversary can manipulate the hash values by applying slight changes in images, either induced by gradient-based approaches or simply by performing standard image transformations, forcing or preventing hash collisions. Such attacks permit malicious actors easily to exploit the detection system: from hiding abusive material to framing innocent users, everything is possible. Moreover, using the hash values, inferences can still be made about the data stored on user devices. In our view, based on our results, deep perceptual hashing in its current form is generally not ready for robust client-side scanning and should not be used from a privacy perspective.
Arxiv Preprint (PDF)

We want to clearly make the following two statements regarding our research:

  • We explicitly condemn the creation, possession, and distribution of child pornography and abusive material and strongly support the prosecution of related crimes. With this work, we in no way intend to provide instructions on how to bypass or manipulate CSAM filters. In turn, we want to initiate a well-founded discussion about the effectiveness and the general application of client-side scanning based on deep perceptual hashing.
  • We have no intention to harm Apple Inc. itself or their intention to stop the distribution of CSAM material. NeuralHash merely forms the empirical basis of our work to critically examine perceptual hashing methods and the risks they may induce in real-world scenarios.

Perceptual Hashing and NeuralHash

Neural Hash Architecture

Perceptual hashing algorithms aim to compute similar hashes for images with similar contents and more divergent hashes for different contents. Deep perceptual hashing relies on deep neural networks to first extract unique features from an image and then compute a hash value based on these features. Perceptual hashing algorithms usually consist of two components. First, a shared feature extractor M extracts visual features from an image x and encodes them in a feature vector z. This resulting feature vector z is an abstract numeric interpretation of the image's characteristic features.

Next, locality-sensitive hashing (LSH) is used to assign close feature vectors to buckets with similar hash values. Among other LSH methods, random projection can be used to quickly convert the extracted features into a bit representation. For each of the k bits, a (random) hyperplane is defined in the hashing matrix B. Each hash bit h_i is set by checking on which side of the i-th hyperplane feature vector z lies. The result is a binary hash vector containing k bits.

Apple recently announced its NeuralHash system, a deep perceptual hashing algorithm for client-side content scanning on iPhones and Macs. NeuralHash focuses on identifying CSAM (child sexual abuse material) content in user files uploaded to Apple's iCloud service. For more details on NeuralHash, visit the official technical summary.

Setup and Preparation

Setup Docker Container

To build the Docker container (for rootful Docker) run the following script:

docker build -t hashing_attacks --build-arg USER_ID=$(id -u) --build-arg GROUP_ID=$(id -g) .

To build the Docker container (for rootless Docker) run the following script:

docker build -t hashing_attacks -f rootless.Dockerfile .

To start the docker container run the following command from the project's root:

docker run --rm --shm-size 16G --name my_hashing_attacks --gpus '"device=0"' -v $(pwd):/code -it hashing_attacks bash

Extract NeuralHash Model and Convert into PyTorch

To extract the NeuralHash model from a recent macOS or iOS build, please follow the conversion guide provided by AppleNeuralHash2ONNX. We will not provide any NeuralHash files or models, neither this repo nor by request. After extracting the onnx model, put the file model.onnx into /models/. Further, put the extracted Core ML file neuralhash_128x96_seed1.dat into /models/coreml_model.

To convert the onnx model into PyTorch, run the following command after creating folder models and putting model.onnx into it. The converted files will be stored at models/model.pth:

python utils/onnx2pytorch.py

Run the Attacks

General remarks: We provide experimental setup and hyperparameters for each attack in our paper, in particular in Appendix A. So please visit the paper for further instructions and technical details of the attacks. Computed metrics for our attacks will usually be written in .txt files into the folder /logs, which is created automatically when running an attack.

Adversary 1 - Hash Collision Attacks

Hash Collision Attack Example

In our first adversarial setting, we investigate the creation of hash collisions. We perturb images so that their hashes match predefined target hashes.

The first step to perform the attack is to create a surrogate hash database from a data folder. For this, run the following script and replace DATASET_FOLDER with a folder containing images:

python utils/compute_dataset_hashes.py --source=DATASET_FOLDER

The computed hashed will be stored in a file hashes.csv in the folder DATASET_FOLDER.

We now can perform the collision attack using the computed hashes as possible targets. Prepare the images to alter in INPUT_FOLDER and run

python adv1_collision_attack.py --source=INPUT_FOLDER --target_hashset=DATASET_FOLDER/hashes.csv

Please note that depending on the number of images and the complexity of the optimization, this attack might run for some time. To store the manipulated images, provide the argument --output_folder=OUTPUT_FOLDER and provide a link to an (empty) folder. For further parameters, e.g. learning rate and optimizer, you can run python adv1_collision_attack.py --help. Images on which a collision was not possible will not be stated in the corresponding log file.

We performed the experiments in our paper with default parameters on the first 10,000 samples from the ImageNet test split and used the Stanford Dogs dataset to compute the surrogate hash database. Both datasets overlap in two images, which we then removed from the results to avoid biases.

To create images with our StyleGAN2-based approach, first clone thestylegan2-ada-pytorch repo into the project root with

git clone https://github.com/NVlabs/stylegan2-ada-pytorch

The StyleGAN2 repo provides various pre-trained models. To download them, run

cd stylegan2-ada-pytorch
wget https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/DATASET.pkl

and replace DATASET with one of [ffhq, metfaces, afhqcat, afhqdog, afhqwild, cifar10, brecahad].

Then run the following script:

python adv1_gan_attack.py --pkl_file=stylegan2-ada-pytorch/DATASET.pkl --target_hashset=DATASET_FOLDER/HASHES.csv

Replace DATASET with the same value as used to download the pickle file. --target_hashset should link to a .csv file of a hash database, as computed with compute_dataset_hashes.py. Note that this attack is more experimentally and might require some fine-tuning of the learning rate and optimizer to achieve good results for different datasets.

Adversary 2 - Gradient-Based Evasion Attacks

Evasion Attacks Example

Our second adversarial setting investigates the robustness of NeuralHash against gradient-based image perturbations. The attacks try to change the hash of any image by perturbating it. This is also called a detection evasion attack.

To run the Standard attack, which adds no pixel restrictions to the optimization, run the following script:

python adv2_evasion_attack.py --source=INPUT_FOLDER

Prepare the images to alter in INPUT_FOLDER. To store the manipulated images, provide the argument --output_folder=OUTPUT_FOLDER and provide a link to a folder. To perform the Edges-Only attack, just att the flag --edges_only.

To run the Few-Pixels attack, run the following script:

python adv2_few_pixels_attack.py --source=INPUT_FOLDER

The optional parameters are nearly the same for both scripts. Again, call the scripts with --help to display all options with a short description.

Images on which a collision was not possible will not be stated in the corresponding log files of the attacks.

We performed the experiments in our paper with default parameters on the first 10,000 samples from the ImageNet test split.

Adversary 3 - Gradient-Free Evasion Attacks

Robustness Examples

Our third adversarial setting measures the robustness of NeuralHash against gradient-free, standard image transformations as provided by standard image editors. The attack investigates the following transformations with varying parameters independently: translation, rotation, center cropping, downsizing, flipping, changes in the HSV color space, contrast changes, and JPEG compression.

To run the analysis, run the following script:

python adv3_robustness_check.py --dataset=DATASET

Replace DATASET with on of ['stl10', 'cifar10', 'cifar100', 'imagenet_test', 'imagenet_train', 'imagenet_val']. For using ImageNet, please put the corresponding *.tar.gz file into /data/ILSVRC2012. The other datasets are downloaded and extracted automatically.

The script provides various options to set the transformation parameters. Call the script with --help to display all available options.

We performed the experiments in our paper on the 1,281,167 samples from the ImageNet training split. To evaluate the results, please run the adv3_evaluation.ipynb notebook.

Adversary 4 - Hash Information Extraction

Classification Categorization
04.34% ± 0.046% 08.76% ± 0.237%
12.03% ± 0.090% 25.85% ± 0.423%
17.75% ± 0.182% 38.59% ± 0.728%

In our last adversarial setting, we want to investigate whether a hash value leaks information about its corresponding image. For this, we need to first compute the hashes of all samples in the dataset and then train a simple classifier that takes a 96-bit vector as input.

We performed the experiments in our paper on ImageNet samples from the ImageNet train and validation split. Please download the files ILSVRC2012_devkit_t12.tar.gz, ILSVRC2012_img_train.tar, ILSVRC2012_img_val.tar and put them into the folder data/ILSVRC2012/. Then run the following script to run the attack:

python adv4_information_extraction.py 

Various training and model parameters such as learning rate, optimizer, dropout probability, and weight decay can be set. Call the script with --help to display all available options.

To evaluate the results, please run the adv4_evaluation.ipynb notebook.

Citation

If you build upon our work, please don't forget to cite us.

@misc{struppek2021learning,
      title={Learning to Break Deep Perceptual Hashing: The Use Case NeuralHash}, 
      author={Lukas Struppek and Dominik Hintersdorf and Daniel Neider and Kristian Kersting},
      year={2021},
      eprint={2111.06628},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Implementation Credits

Some of our implementations rely on or are inspired by other repos. We want to thank the authors for making their code publicly available.

Owner
[email protected]
Machine Learning Group at TU Darmstadt
<a href=[email protected]">
Speckle-free Holography with Partially Coherent Light Sources and Camera-in-the-loop Calibration

Speckle-free Holography with Partially Coherent Light Sources and Camera-in-the-loop Calibration Project Page | Paper Yifan Peng*, Suyeon Choi*, Jongh

Stanford Computational Imaging Lab 19 Dec 11, 2022
Fast, general, and tested differentiable structured prediction in PyTorch

Fast, general, and tested differentiable structured prediction in PyTorch

HNLP 1.1k Dec 16, 2022
A Python library for common tasks on 3D point clouds

Point Cloud Utils (pcu) - A Python library for common tasks on 3D point clouds Point Cloud Utils (pcu) is a utility library providing the following fu

Francis Williams 622 Dec 27, 2022
Variational autoencoder for anime face reconstruction

VAE animeface Variational autoencoder for anime face reconstruction Introduction This repository is an exploratory example to train a variational auto

Minzhe Zhang 2 Dec 11, 2021
Official implementation for the paper "Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection"

Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection PyTorch code release of the paper "Attentive Prototypes for Sour

Deepti Hegde 23 Oct 17, 2022
Convert openmmlab (not only mmdetection) series model to tensorrt

MMDet to TensorRT This project aims to convert the mmdetection model to TensorRT model end2end. Focus on object detection for now. Mask support is exp

JinTian 4 Dec 17, 2021
A collection of resources on GAN Inversion.

This repo is a collection of resources on GAN inversion, as a supplement for our survey

Bib-parser - Convenient script to parse .bib files with the ACM Digital Library like metadata

Bib Parser Convenient script to parse .bib files with the ACM Digital Library li

Mehtab Iqbal (Shahan) 1 Jan 26, 2022
Code + pre-trained models for the paper Keeping Your Eye on the Ball Trajectory Attention in Video Transformers

Motionformer This is an official pytorch implementation of paper Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers. In this rep

Facebook Research 192 Dec 23, 2022
Global Rhythm Style Transfer Without Text Transcriptions

Global Prosody Style Transfer Without Text Transcriptions This repository provides a PyTorch implementation of AutoPST, which enables unsupervised glo

Kaizhi Qian 193 Dec 30, 2022
Code for this paper The Lottery Ticket Hypothesis for Pre-trained BERT Networks.

The Lottery Ticket Hypothesis for Pre-trained BERT Networks Code for this paper The Lottery Ticket Hypothesis for Pre-trained BERT Networks. [NeurIPS

VITA 122 Dec 14, 2022
System Design course at HSE (2021)

System Design course at HSE (2021) Wiki-страница курса Структура репозитория: slides - директория с презентациями с занятий tasks - материалы для выпо

22 Dec 25, 2022
Neural Turing Machines (NTM) - PyTorch Implementation

PyTorch Neural Turing Machine (NTM) PyTorch implementation of Neural Turing Machines (NTM). An NTM is a memory augumented neural network (attached to

Guy Zana 519 Dec 21, 2022
Implementation based on Paper - Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling

Implementation based on Paper - Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling

HamasKhan 3 Jul 08, 2022
Re-TACRED: Addressing Shortcomings of the TACRED Dataset

Re-TACRED Re-TACRED: Addressing Shortcomings of the TACRED Dataset

George Stoica 40 Dec 10, 2022
The official start-up code for paper "FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark."

FFA-IR The official start-up code for paper "FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark." The framework is inheri

Mingjie 28 Dec 16, 2022
Benchmark for the generalization of 3D machine learning models across different remeshing/samplings of a surface.

Discretization Robust Correspondence Benchmark One challenge of machine learning on 3D surfaces is that there are many different representations/sampl

Nicholas Sharp 10 Sep 30, 2022
Optimal Camera Position for a Practical Application of Gaze Estimation on Edge Devices,

Optimal Camera Position for a Practical Application of Gaze Estimation on Edge Devices, Linh Van Ma, Tin Trung Tran, Moongu Jeon, ICAIIC 2022 (The 4th

Linh 11 Oct 10, 2022
StarGAN2 for practice

StarGAN2 for practice This version of StarGAN2 (coined as 'Post-modern Style Transfer') is intended mostly for fellow artists, who rarely look at scie

vadim epstein 87 Sep 24, 2022
Repo for flood prediction using LSTMs and HAND

Abstract Every year, floods cause billions of dollars’ worth of damages to life, crops, and property. With a proper early flood warning system in plac

1 Oct 27, 2021