[NeurIPS 2021] "G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators"

Overview

G-PATE

This is the official code base for our NeurIPS 2021 paper:

"G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators."

Yunhui Long*, Boxin Wang*, Zhuolin Yang, Bhavya Kailkhura, Aston Zhang, Carl A. Gunter, Bo Li

Citation

@article{long2021gpate,
  title={G-PATE: Scalable Differentially Private Data Generator via Private Aggregation of Teacher Discriminators},
  author={Long, Yunhui and Wang, Boxin and Yang, Zhuolin and Kailkhura, Bhavya and Zhang, Aston and Gunter, Carl A. and Li, Bo},
  journal={NeurIPS 2021},
  year={2021}
}

Usage

Prepare your environment

Download required packages

pip install -r requirements.txt

Prepare your data

Please store the training data in $data_dir. By default, $data_dir is set to ../../data.

We provide a script to download the MNIST and Fashion Mnist datasets.

python download.py [dataset_name]

For MNIST, you can run

python download.py mnist

For Fashion-MNIST, you can run

python download.py fashion_mnist

For CelebA datasets, please refer to their official websites for downloading.

Training

python main.py --checkpoint_dir [checkpoint_dir] --dataset [dataset_name] --train

Example of one of our best commands on MNIST:

Given eps=1,

python main.py --checkpoint_dir mnist_teacher_4000_z_dim_50_c_1e-4/ --teachers_batch 40 --batch_teachers 100 --dataset mnist --train --sigma_thresh 3000 --sigma 1000 --step_size 1e-4 --max_eps 1 --nopretrain --z_dim 50 --batch_size 64

By default, after it reaches the max epsilon=1, it will generate 100,000 DP samples as eps-1.00.data.pkl in checkpoint_dir.

Given eps=10,

python main.py --checkpoint_dir mnist_teacher_2000_z_dim_100_eps_10/ --teachers_batch 40 --batch_teachers 50 --dataset mnist --train --sigma_thresh 600 --sigma 100 --step_size 1e-4 --max_eps 10 --nopretrain --z_dim 100 --batch_size 64

By default, after it reaches the max epsilon=10, it will generate 100,000 DP samples as eps-9.9x.data.pkl in checkpoint_dir.

Generating synthetic samples

python main.py --checkpoint_dir [checkpoint_dir] --dataset [dataset_name]

Evaluate the synthetic records

We follow the standard the protocl and train a classifier on synthetic samples and test it on real samples.

For MNIST,

python evaluation/train-classifier-mnist.py --data [DP_data_dir]

For Fashion-MNIST,

python evaluation/train-classifier-fmnist.py --data [DP_data_dir]

For CelebA-Gender,

python evaluation/train-classifier-celebA.py --data [DP_data_dir]

For CelebA-Gender (Small),

python evaluation/train-classifier-small-celebA.py --data [DP_data_dir]

For CelebA-Hair,

python evaluation/train-classifier-hair.py --data [DP_data_dir]

The [DP_data_dir] is where your generated DP samples are located.

In the MNIST example above, we have generated DP samples in $checkpoint_dir/eps-1.00.data.

During evaluation, you should run with DP_data_dir=$checkpoint_dir/eps-1.00.data.

python evaluation/train-classifier-mnist.py --data $checkpoint_dir/eps-1.00.data
Owner
AI Secure
UIUC Secure Learning Lab
AI Secure
TensorFlow Metal Backend on Apple Silicon Experiments (just for fun)

tf-metal-experiments TensorFlow Metal Backend on Apple Silicon Experiments (just for fun) Setup This is tested on M1 series Apple Silicon SOC only. Te

Timothy Liu 161 Jan 03, 2023
Lite-HRNet: A Lightweight High-Resolution Network

LiteHRNet Benchmark πŸ”₯ πŸ”₯ Based on MMsegmentation πŸ”₯ πŸ”₯ Cityscapes FCN resize concat config mIoU last mAcc last eval last mIoU best mAcc best eval bes

16 Dec 12, 2022
Deep learning image registration library for PyTorch

TorchIR: Pytorch Image Registration TorchIR is a image registration library for deep learning image registration (DLIR). I have integrated several ide

Bob de Vos 40 Dec 16, 2022
Autoregressive Models in PyTorch.

Autoregressive This repository contains all the necessary PyTorch code, tailored to my presentation, to train and generate data from WaveNet-like auto

Christoph Heindl 41 Oct 09, 2022
Learning Saliency Propagation for Semi-supervised Instance Segmentation

Learning Saliency Propagation for Semi-supervised Instance Segmentation PyTorch Implementation This repository contains: the PyTorch implementation of

Berkeley DeepDrive 68 Oct 18, 2022
September-Assistant - Open-source Windows Voice Assistant

September - Windows Assistant September is an open-source Windows personal assis

The Nithin Balaji 9 Nov 22, 2022
Sync2Gen Code for ICCV 2021 paper: Scene Synthesis via Uncertainty-Driven Attribute Synchronization

Sync2Gen Code for ICCV 2021 paper: Scene Synthesis via Uncertainty-Driven Attribute Synchronization 0. Environment Environment: python 3.6 and cuda 10

Haitao Yang 62 Dec 30, 2022
Code for ICDM2020 full paper: "Sub-graph Contrast for Scalable Self-Supervised Graph Representation Learning"

Subg-Con Sub-graph Contrast for Scalable Self-Supervised Graph Representation Learning (Jiao et al., ICDM 2020): https://arxiv.org/abs/2009.10273 Over

34 Jul 06, 2022
Semantic Segmentation of images using PixelLib with help of Pascalvoc dataset trained with Deeplabv3+ framework.

CARscan- Approach 1 - Segmentation of images by detecting contours. It failed because in images with elements along with cars were also getting detect

Padmanabha Banerjee 5 Jul 29, 2021
Pytorch implementation of Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization https://arxiv.org/abs/2008.11646

[TCSVT] Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization LPN [Paper] NEWs Prerequisites Python 3.6 GPU Memory = 8G Numpy 1.

46 Dec 14, 2022
Neural Radiance Fields Using PyTorch

This project is a PyTorch implementation of Neural Radiance Fields (NeRF) for reproduction of results whilst running at a faster speed.

Vedant Ghodke 1 Feb 11, 2022
The official implementation of CSG-Stump: A Learning Friendly CSG-Like Representation for Interpretable Shape Parsing

CSGStumpNet The official implementation of CSG-Stump: A Learning Friendly CSG-Like Representation for Interpretable Shape Parsing Paper | Project page

Daxuan 39 Dec 26, 2022
codes for Self-paced Deep Regression Forests with Consideration on Ranking Fairness

Self-paced Deep Regression Forests with Consideration on Ranking Fairness This is official codes for paper Self-paced Deep Regression Forests with Con

Learning in Vision 4 Sep 11, 2022
Template repository to build PyTorch projects from source on any version of PyTorch/CUDA/cuDNN.

The Ultimate PyTorch Source-Build Template Translations: ν•œκ΅­μ–΄ TL;DR PyTorch built from source can be x4 faster than a naΓ―ve PyTorch install. This repos

Joonhyung Lee/μ΄μ€€ν˜• 651 Dec 12, 2022
The code for Expectation-Maximization Attention Networks for Semantic Segmentation (ICCV'2019 Oral)

EMANet News The bug in loading the pretrained model is now fixed. I have updated the .pth. To use it, download it again. EMANet-101 gets 80.99 on the

Xia Li 李倏 663 Nov 30, 2022
Winning solution of the Indoor Location & Navigation Kaggle competition

This repository contains the code to generate the winning solution of the Kaggle competition on indoor location and navigation organized by Microsoft

Tom Van de Wiele 62 Dec 28, 2022
Physics-Informed Neural Networks (PINN) and Deep BSDE Solvers of Differential Equations for Scientific Machine Learning (SciML) accelerated simulation

NeuralPDE NeuralPDE.jl is a solver package which consists of neural network solvers for partial differential equations using scientific machine learni

SciML Open Source Scientific Machine Learning 680 Jan 02, 2023
When are Iterative GPs Numerically Accurate?

When are Iterative GPs Numerically Accurate? This is a code repository for the paper "When are Iterative GPs Numerically Accurate?" by Wesley Maddox,

Wesley Maddox 1 Jan 06, 2022
ContourletNet: A Generalized Rain Removal Architecture Using Multi-Direction Hierarchical Representation

ContourletNet: A Generalized Rain Removal Architecture Using Multi-Direction Hierarchical Representation (Accepted by BMVC'21) Abstract: Images acquir

10 Dec 08, 2022
The open-source and free to use Python package miseval was developed to establish a standardized medical image segmentation evaluation procedure

miseval: a metric library for Medical Image Segmentation EVALuation The open-source and free to use Python package miseval was developed to establish

59 Dec 10, 2022