NeurIPS 2021

Title: Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck (paper)

Authors: Junho Kim*, Byung-Kwan Lee*, and Yong Man Ro (*: equally contributed)

Affiliation: School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST)

Email: `[email protected]`, `[email protected]`, `[email protected]`

This is the official PyTorch implementation of the paper "Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck", published at NeurIPS 2021. It provides a novel method for decomposing the features of an intermediate layer into robust and non-robust features. We further examine the semantic information of the distilled features by directly visualizing the robust and non-robust features in the feature representation space. The visualizations reveal that both robust and non-robust features carry semantic information that is perceptible to humans. For more detail, please refer to our paper.

Citation

If you find this work helpful, please cite it as:

@inproceedings{
kim2021distilling,
title={Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck},
author={Junho Kim and Byung-Kwan Lee and Yong Man Ro},
booktitle={Advances in Neural Information Processing Systems},
editor={A. Beygelzimer and Y. Dauphin and P. Liang and J. Wortman Vaughan},
year={2021},
url={https://openreview.net/forum?id=90M-91IZ0JC}
}

Datasets

Baseline Models

VGG-16 (model/vgg.py)
WideResNet-28-10 (model/wideresnet.py)

Adversarial Attacks (by torchattacks)

Fast Gradient Sign Method (FGSM)
Basic Iterative Method (BIM)
Projected Gradient Descent (PGD)
Carlini & Wagner (CW)
AutoAttack (AA)
Fast Adaptive Boundary (FAB)

The implementation details are in loader/loader.py.

    # Gradient Clamping based Attack
    if args.attack == "fgsm":
        return torchattacks.FGSM(model=net, eps=args.eps)

    elif args.attack == "bim":
        return torchattacks.BIM(model=net, eps=args.eps, alpha=1/255)

    elif args.attack == "pgd":
        return torchattacks.PGD(model=net, eps=args.eps,
                                alpha=args.eps/args.steps*2.3, steps=args.steps, random_start=True)

    elif args.attack == "cw":
        return torchattacks.CW(model=net, c=0.1, lr=0.1, steps=200)

    elif args.attack == "auto":
        # Note: this branch builds APGD, one component of AutoAttack, not the full ensemble
        return torchattacks.APGD(model=net, eps=args.eps)

    elif args.attack == "fab":
        return torchattacks.FAB(model=net, eps=args.eps, n_classes=args.n_classes)
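
As a quick check of the PGD settings above, the sketch below works out the step size implied by alpha=args.eps/args.steps*2.3 for the default --eps 0.03 and --steps 10 listed in this README. It assumes images scaled to [0, 1], the input range torchattacks expects. The function name pgd_step_size is hypothetical and is not part of the repository.

```python
import math


def pgd_step_size(eps: float, steps: int, factor: float = 2.3) -> float:
    """Per-step size from the PGD branch above: eps / steps * factor (hypothetical helper)."""
    return eps / steps * factor


eps, steps = 0.03, 10  # defaults of --eps and --steps in this README
alpha = pgd_step_size(eps, steps)

print(f"eps   = {eps:.4f} (about {eps * 255:.2f}/255)")
print(f"alpha = {alpha:.4f} (about {alpha * 255:.2f}/255)")
# Total movement if every step pushed in the same direction, before projection
# back onto the eps-ball.
print(f"steps * alpha = {steps * alpha:.4f} = {steps * alpha / eps:.1f} x eps")
```

Because steps * alpha is 2.3 times eps, the iterates have enough budget to reach the boundary of the eps-ball from a random start. The BIM branch, by contrast, uses a fixed alpha=1/255 regardless of eps.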

Included Packages (for Ours)

Informative Feature Package (model/IFP.py)
- Distilling robust and non-robust features in an intermediate layer by Information Bottleneck
Visualization of robust and non-robust features (visualization/inversion.py)
Non-Robust Feature (NRF) and Robust Feature (RF) Attack (model/IFP.py)
- NRF : maximizing the magnitude of non-robust feature gradients
- NRF2 : minimizing the magnitude of non-robust feature gradients
- RF : maximizing the magnitude of robust feature gradients
- RF2 : minimizing the magnitude of robust feature gradients

Baseline Methods

Plain (Plain Training)

Run train_plain.py

  parser.add_argument('--lr', default=0.01, type=float, help='learning rate')
  parser.add_argument('--dataset', default='cifar10', type=str, help='dataset name')
  parser.add_argument('--network', default='vgg', type=str, help='network name')
  parser.add_argument('--gpu_id', default='0', type=str, help='gpu id')
  parser.add_argument('--data_root', default='./datasets', type=str, help='path to dataset')
  parser.add_argument('--epoch', default=60, type=int, help='epoch number')
  parser.add_argument('--batch_size', default=100, type=int, help='Batch size')
  parser.add_argument('--pretrained', default='false', type=str2bool, help='pretrained boolean')
  parser.add_argument('--batchnorm', default='true', type=str2bool, help='batchnorm boolean')
  parser.add_argument('--save_dir', default='./experiment', type=str, help='save directory')
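
The argument lists in this README pass type=str2bool for the boolean flags, but the helper itself is not shown here. A common way to write such a helper is sketched below. This is an assumption about its behavior, not the repository's actual code.

```python
import argparse


def str2bool(value):
    """Convert strings such as 'true' or 'false' to a bool for argparse (hypothetical helper)."""
    if isinstance(value, bool):
        return value
    text = value.strip().lower()
    if text in ("true", "t", "yes", "y", "1"):
        return True
    if text in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"boolean value expected, got {value!r}")


parser = argparse.ArgumentParser()
parser.add_argument('--pretrained', default='false', type=str2bool, help='pretrained boolean')
parser.add_argument('--batchnorm', default='true', type=str2bool, help='batchnorm boolean')

# argparse also runs string defaults through `type`, so 'true' becomes True.
args = parser.parse_args(['--pretrained', 'True'])
print(args.pretrained, args.batchnorm)
```

A converter like this is needed because type=bool would treat any non-empty string, including 'false', as True.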

AT (PGD Adversarial Training)

Run train_AT.py

  parser.add_argument('--lr', default=0.01, type=float, help='learning rate')
  parser.add_argument('--steps', default=10, type=int, help='adv. steps')
  parser.add_argument('--eps', default=0.03, type=float, help='max norm')
  parser.add_argument('--dataset', default='cifar10', type=str, help='dataset name')
  parser.add_argument('--network', default='vgg', type=str, help='network name')
  parser.add_argument('--gpu_id', default='0', type=str, help='gpu id')
  parser.add_argument('--data_root', default='./datasets', type=str, help='path to dataset')
  parser.add_argument('--epoch', default=60, type=int, help='epoch number')
  parser.add_argument('--batch_size', default=100, type=int, help='Batch size')
  parser.add_argument('--attack', default='pgd', type=str, help='attack type')
  parser.add_argument('--pretrained', default='false', type=str2bool, help='pretrained boolean')
  parser.add_argument('--batchnorm', default='true', type=str2bool, help='batchnorm boolean')
  parser.add_argument('--save_dir', default='./experiment', type=str, help='save directory')

TRADES (Recent defense method)

Run train_TRADES.py

  parser.add_argument('--lr', default=0.01, type=float, help='learning rate')
  parser.add_argument('--steps', default=10, type=int, help='adv. steps')
  parser.add_argument('--eps', default=0.03, type=float, help='max norm')
  parser.add_argument('--dataset', default='cifar10', type=str, help='dataset name')
  parser.add_argument('--network', default='wide', type=str, help='network name: vgg or wide')
  parser.add_argument('--gpu_id', default='0', type=str, help='gpu id')
  parser.add_argument('--data_root', default='./datasets', type=str, help='path to dataset')
  parser.add_argument('--epoch', default=60, type=int, help='epoch number')
  parser.add_argument('--batch_size', default=100, type=int, help='Batch size')
  parser.add_argument('--attack', default='pgd', type=str, help='attack type')
  parser.add_argument('--pretrained', default='false', type=str2bool, help='pretrained boolean')
  parser.add_argument('--batchnorm', default='true', type=str2bool, help='batchnorm boolean')
  parser.add_argument('--save_dir', default='./experiment', type=str, help='save directory')

MART (Recent defense method)

Run train_MART.py

  parser.add_argument('--lr', default=0.01, type=float, help='learning rate')
  parser.add_argument('--steps', default=10, type=int, help='adv. steps')
  parser.add_argument('--eps', default=0.03, type=float, help='max norm')
  parser.add_argument('--dataset', default='cifar10', type=str, help='dataset name')
  parser.add_argument('--network', default='wide', type=str, help='network name')
  parser.add_argument('--gpu_id', default='0', type=str, help='gpu id')
  parser.add_argument('--data_root', default='./datasets', type=str, help='path to dataset')
  parser.add_argument('--epoch', default=60, type=int, help='epoch number')
  parser.add_argument('--batch_size', default=100, type=int, help='Batch size')
  parser.add_argument('--attack', default='pgd', type=str, help='attack type')
  parser.add_argument('--pretrained', default='false', type=str2bool, help='pretrained boolean')
  parser.add_argument('--batchnorm', default='true', type=str2bool, help='batchnorm boolean')
  parser.add_argument('--save_dir', default='./experiment', type=str, help='save directory')
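
The four training scripts above share most of their flags, so one small launcher can cover them. The sketch below only builds the command line from the script names and flags listed in this README. The helper build_train_command and the TRAIN_SCRIPTS table are hypothetical conveniences, not part of the repository.

```python
import sys

# Script names as listed under "Baseline Methods" in this README.
TRAIN_SCRIPTS = {
    "Plain": "train_plain.py",
    "AT": "train_AT.py",
    "TRADES": "train_TRADES.py",
    "MART": "train_MART.py",
}


def build_train_command(method, **flags):
    """Return an argv list such as [python, train_AT.py, --network, vgg]."""
    command = [sys.executable, TRAIN_SCRIPTS[method]]
    for name, value in flags.items():
        command += [f"--{name}", str(value)]
    return command


cmd = build_train_command("AT", dataset="cifar10", network="vgg", eps=0.03, steps=10)
print(" ".join(cmd))
# To launch training from the repository root, one could then run:
# import subprocess; subprocess.run(cmd, check=True)
```

Flags that are omitted fall back to the argparse defaults shown above, for example --lr 0.01 and --epoch 60.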

Testing Model Robustness

Measuring the robustness of baseline models trained with baseline methods

Run test.py

  parser.add_argument('--steps', default=10, type=int, help='adv. steps')
  parser.add_argument('--eps', default=0.03, type=float, help='max norm')
  parser.add_argument('--dataset', default='cifar10', type=str, help='dataset name')
  parser.add_argument('--network', default='vgg', type=str, help='network name')
  parser.add_argument('--data_root', default='./datasets', type=str, help='path to dataset')
  parser.add_argument('--gpu_id', default='0', type=str, help='gpu id')
  parser.add_argument('--save_dir', default='./experiment', type=str, help='save directory')
  parser.add_argument('--batch_size', default=100, type=int, help='Batch size')
  parser.add_argument('--pop_number', default=3, type=int, help='Batch size')
  parser.add_argument('--datetime', default='00000000', type=str, help='checkpoint datetime')
  parser.add_argument('--pretrained', default='false', type=str2bool, help='pretrained boolean')
  parser.add_argument('--batchnorm', default='true', type=str2bool, help='batchnorm boolean')
  parser.add_argument('--baseline', default='AT', type=str, help='baseline')

Visualizing Robust and Non-Robust Features

Feature Interpretation

Run visualize.py

  parser.add_argument('--lr', default=0.01, type=float, help='learning rate')
  parser.add_argument('--steps', default=10, type=int, help='adv. steps')
  parser.add_argument('--eps', default=0.03, type=float, help='max norm')
  parser.add_argument('--dataset', default='cifar10', type=str, help='dataset name')
  parser.add_argument('--network', default='vgg', type=str, help='network name')
  parser.add_argument('--gpu_id', default='0', type=str, help='gpu id')
  parser.add_argument('--data_root', default='./datasets', type=str, help='path to dataset')
  parser.add_argument('--epoch', default=0, type=int, help='epoch number')
  parser.add_argument('--attack', default='pgd', type=str, help='attack type')
  parser.add_argument('--save_dir', default='./experiment', type=str, help='save directory')
  parser.add_argument('--batch_size', default=1, type=int, help='Batch size')
  parser.add_argument('--pop_number', default=3, type=int, help='Batch size')
  parser.add_argument('--prior', default='AT', type=str, help='Plain or AT')
  parser.add_argument('--prior_datetime', default='00000000', type=str, help='checkpoint datetime')
  parser.add_argument('--pretrained', default='false', type=str2bool, help='pretrained boolean')
  parser.add_argument('--batchnorm', default='true', type=str2bool, help='batchnorm boolean')
  parser.add_argument('--vis_atk', default='True', type=str2bool, help='is attacked image?')
