Skip to content

sayakpaul/Denoised-Smoothing-TF

Repository files navigation

Denoised-Smoothing-TF

Minimal implementation of Denoised Smoothing: A Provable Defense for Pretrained Classifiers in TensorFlow. This implementation is now a part of Neural Structured Learning.

Denoised Smoothing is a simple and elegant way to (provably) robustify pre-trained image classification models (including the cloud APIs with only query access) and l2 adversarial attacks. This blog post provides a nice introduction to the method. The figure below summarizes what Denoised Smoothing is and how it works:


  • Take a pre-trained classifier and prepend a pre-trained denoiser with it. Of course, the dataset on which the classifier and the denoiser would need to be trained on the same/similar dataset.
  • Apply Randomized Smoothing.

Randomized Smoothing is a well-tested method to provably defend against l2 adversarial attacks under a specific radii. But it assumes that a classifier performs well under Gaussian noisy perturbations which may not always be the case.

Note: I utilized many scripts from the official repository of Denoised Smoothing to develop this repository. My aim with this repository is to provide a template for researchers to conduct certification tests with Keras/TensorFlow models. I encourage the readers to check out the original repository, it's really well-developed.

Further notes

All the notebooks can be executed on Colab! You also have the option to train using the free TPUs.

If you run into TypeError: Input 'y' of 'AddV2' Op has type float64 that does not match type float32 of argument 'x' error while training the denoiser, try the following (#1):

noise = tf.experimental.numpy.random.randn(batch_size, 32, 32, 3) * self.sigma
noise = tf.cast(noise, tf.float32)

This is not required if you are using TensorFlow 2.4.1.

Results

Denoiser with stability objective Denoiser with MSE objective

As we can see prepending a pre-trained denoiser is extremely helpful for our purpose.

Models

The models are available inside models.tar.gz in the SavedModel format. In the interest of reproducibility, the initial model weights are also provided.

Acknowledgements

Paper citation

@inproceedings{NEURIPS2020_f9fd2624,
 author = {Salman, Hadi and Sun, Mingjie and Yang, Greg and Kapoor, Ashish and Kolter, J. Zico},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin},
 pages = {21945--21957},
 publisher = {Curran Associates, Inc.},
 title = {Denoised Smoothing: A Provable Defense for Pretrained Classifiers},
 url = {https://proceedings.neurips.cc/paper/2020/file/f9fd2624beefbc7808e4e405d73f57ab-Paper.pdf},
 volume = {33},
 year = {2020}
}