[ICCV 2021] Amplitude-Phase Recombination: Rethinking Robustness of Convolutional Neural Networks in Frequency Domain

Overview

Amplitude-Phase Recombination (ICCV'21)

Official PyTorch implementation of "Amplitude-Phase Recombination: Rethinking Robustness of Convolutional Neural Networks in Frequency Domain", Guangyao Chen, Peixi Peng, Li Ma, Jia Li, Lin Du, and Yonghong Tian.

Paper: https://arxiv.org/abs/2108.08487

Abstract: Recently, the generalization behavior of Convolutional Neural Networks (CNN) is gradually transparent through explanation techniques with the frequency components decomposition. However, the importance of the phase spectrum of the image for a robust vision system is still ignored. In this paper, we notice that the CNN tends to converge at the local optimum which is closely related to the high-frequency components of the training images, while the amplitude spectrum is easily disturbed such as noises or common corruptions. In contrast, more empirical studies found that humans rely on more phase components to achieve robust recognition. This observation leads to more explanations of the CNN's generalization behaviors in both adversarial attack and out-of-distribution detection, and motivates a new perspective on data augmentation designed by re-combing the phase spectrum of the current image and the amplitude spectrum of the distracter image. That is, the generated samples force the CNN to pay more attention on the structured information from phase components and keep robust to the variation of the amplitude. Experiments on several image datasets indicate that the proposed method achieves state-of-the-art performances on multiple generalizations and calibration tasks, including adaptability for common corruptions and surface variations, out-of-distribution detection and adversarial attack.

Highlights

Fig. 1: More empirical studies found that humans rely on more phase components to achieve robust recognition. However, CNN without effective training restrictions tends to converge at the local optimum related to the amplitude spectrum of the image, leading to generalization behaviors counter-intuitive to humans (the sensitive to various corruptions and the overconfidence of OOD). main hypothesis of the paper

Examples of the importance of phase spectrum to explain the counter-intuitive behavior of CNN

Fig. 2: Four pairs of testing sampless selected from in-distribution CIFAR-10 and OOD SVHN that help explain that CNN capture more amplitude specturm than phase specturm for classification: First, in (a) and (b), the model correctly predicts the original image (1st column in each panel), but the predicts are also exchanged after switching amplitude specturm (3rd column in each panel) while the human eye can still give the correct category through the contour information. Secondly, the model is overconfidence for the OOD samples in (c) and (d). Similarly, after the exchange of amplitude specturm, the label with high confidence is also exchanged.

Fig. 3: Two ways of the proposed Amplitude-Phase Recombination: APR-P and APR-S. Motivated by the powerful generalizability of the human, we argue that reducing the dependence on the amplitude spectrum and enhancing the ability to capture phase spectrum can improve the robustness of CNN.

Citation

If you find our work, this repository and pretrained adversarial generators useful. Please consider giving a star and citation.

@inproceedings{chen2021amplitude,
    title={Amplitude-Phase Recombination: Rethinking Robustness of Convolutional Neural Networks in Frequency Domain},
    author={Chen, Guangyao and Peng, Peixi and Ma, Li and Li, Jia and Du, Lin and Tian, Yonghong},
    booktitle={Proceedings of the IEEE International Conference on Computer Vision},
    year={2021}
}

1. Requirements

Environments

Currently, requires following packages

  • python 3.6+
  • torch 1.7.1+
  • torchvision 0.5+
  • CUDA 10.1+
  • scikit-learn 0.22+

Datasets

For even quicker experimentation, there is CIFAR-10-C and CIFAR-100-C. please download these datasets to ./data/CIFAR-10-C and ./data/CIFAR-100-C.

2. Training & Evaluation

To train the models in paper, run this command:

python main.py --aug <augmentations>

Option --aug can be one of None/APR-S. The default training method is APR-P. To evaluate the model, add --eval after this command.

APRecombination for APR-S and mix_data for APR-P can plug and play in other training codes.

3. Results

Fourier Analysis

The standard trained model is highly sensitive to additive noise in all but the lowest frequencies. APR-SP could substantially improve robustness to most frequency perturbations. The code of Heat maps is developed upon the following project FourierHeatmap.

ImageNet-C

  • Results of ResNet-50 models on ImageNet-C:
+(APR-P) +(APR-S) +(APR-SP) +DeepAugMent+(ARP-SP)
mCE 70.5 69.3 65.0 57.5
Owner
Guangyao Chen
Ph.D student @ PKU
Guangyao Chen
Pneumonia Detection using machine learning - with PyTorch

Pneumonia Detection Pneumonia Detection using machine learning. Training was done in colab: DEMO: Result (Confusion Matrix): Data I uploaded my datase

Wilhelm Berghammer 12 Jul 07, 2022
An SE(3)-invariant autoencoder for generating the periodic structure of materials

Crystal Diffusion Variational AutoEncoder This software implementes Crystal Diffusion Variational AutoEncoder (CDVAE), which generates the periodic st

Tian Xie 94 Dec 10, 2022
A Neural Net Training Interface on TensorFlow, with focus on speed + flexibility

Tensorpack is a neural network training interface based on TensorFlow. Features: It's Yet Another TF high-level API, with speed, and flexibility built

Tensorpack 6.2k Jan 09, 2023
HAR-stacked-residual-bidir-LSTMs - Deep stacked residual bidirectional LSTMs for HAR

HAR-stacked-residual-bidir-LSTM The project is based on this repository which is presented as a tutorial. It consists of Human Activity Recognition (H

Guillaume Chevalier 287 Dec 27, 2022
[ICLR 2021] Is Attention Better Than Matrix Decomposition?

Enjoy-Hamburger 🍔 Official implementation of Hamburger, Is Attention Better Than Matrix Decomposition? (ICLR 2021) Under construction. Introduction T

Gsunshine 271 Dec 29, 2022
Google Landmark Recogntion and Retrieval 2021 Solutions

Google Landmark Recogntion and Retrieval 2021 Solutions In this repository you can find solution and code for Google Landmark Recognition 2021 and Goo

Vadim Timakin 5 Nov 25, 2022
Ludwig Benchmarking Toolkit

Ludwig Benchmarking Toolkit The Ludwig Benchmarking Toolkit is a personalized benchmarking toolkit for running end-to-end benchmark studies across an

HazyResearch 17 Nov 18, 2022
Text Summarization - WCN — Weighted Contextual N-gram method for evaluation of Text Summarization

Text Summarization WCN — Weighted Contextual N-gram method for evaluation of Text Summarization In this project, I fine tune T5 model on Extreme Summa

Aditya Shah 1 Jan 03, 2022
Code for Transformer Hawkes Process, ICML 2020.

Transformer Hawkes Process Source code for Transformer Hawkes Process (ICML 2020). Run the code Dependencies Python 3.7. Anaconda contains all the req

Simiao Zuo 111 Dec 26, 2022
Human Action Controller - A human action controller running on different platforms.

Human Action Controller (HAC) Goal A human action controller running on different platforms. Fun Easy-to-use Accurate Anywhere Fun Examples Mouse Cont

27 Jul 20, 2022
Full body anonymization - Realistic Full-Body Anonymization with Surface-Guided GANs

Realistic Full-Body Anonymization with Surface-Guided GANs This is the official

Håkon Hukkelås 30 Nov 18, 2022
GANTheftAuto is a fork of the Nvidia's GameGAN

Description GANTheftAuto is a fork of the Nvidia's GameGAN, which is research focused on emulating dynamic game environments. The early research done

Harrison 801 Dec 27, 2022
Point cloud processing tool library.

Point Cloud ToolBox This point cloud processing tool library can be used to process point clouds, 3d meshes, and voxels. Environment python 3.7.5 Dep

ZhangXinyun 40 Dec 09, 2022
mmdetection version of TinyBenchmark.

introduction This project is an mmdetection version of TinyBenchmark. TODO list: add TinyPerson dataset and evaluation add crop and merge for image du

34 Aug 27, 2022
Conformer: Local Features Coupling Global Representations for Visual Recognition

Conformer: Local Features Coupling Global Representations for Visual Recognition (arxiv) This repository is built upon DeiT and timm Usage First, inst

Zhiliang Peng 378 Jan 08, 2023
auto-tuning momentum SGD optimizer

YellowFin YellowFin is an auto-tuning optimizer based on momentum SGD which requires no manual specification of learning rate and momentum. It measure

Jian Zhang 288 Nov 19, 2022
Single object tracking and segmentation.

Single/Multiple Object Tracking and Segmentation Codes and comparison of recent single/multiple object tracking and segmentation. News 💥 AutoMatch is

ZP ZHANG 385 Jan 02, 2023
Breaching - Breaching privacy in federated learning scenarios for vision and text

Breaching - A Framework for Attacks against Privacy in Federated Learning This P

Jonas Geiping 139 Jan 03, 2023
Swin-Transformer is basically a hierarchical Transformer whose representation is computed with shifted windows.

Swin-Transformer Swin-Transformer is basically a hierarchical Transformer whose representation is computed with shifted windows. For more details, ple

旷视天元 MegEngine 9 Mar 14, 2022
Machine Learning in Asset Management (by @firmai)

Machine Learning in Asset Management If you like this type of content then visit ML Quant site below: https://www.ml-quant.com/ Part One Follow this l

Derek Snow 1.5k Jan 02, 2023