On Robust Prefix-Tuning for Text Classification

Code for the ICLR'22 paper: On Robust Prefix-Tuning for Text Classification

Overview

Prefix-tuning has drawn much attention as a parameter-efficient and modular alternative for adapting pretrained language models to downstream tasks. However, we find that prefix-tuning is vulnerable to adversarial attacks. Unfortunately, current robust NLP methods are unsuitable for prefix-tuning because they inevitably hamper its modularity. In our ICLR'22 paper, we propose robust prefix-tuning for text classification. Our method leverages the idea of test-time tuning, which preserves the strengths of prefix-tuning while improving its robustness. This repository contains the code for the proposed robust prefix-tuning method.

Prerequisites

PyTorch>=1.2.0, pytorch-transformers==1.2.0, OpenAttack==2.0.1, and GPUtil==1.4.0.
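
With pip, one possible install command is the following (a suggestion only; pick the PyTorch build matching your CUDA version):

  pip install "torch>=1.2.0" pytorch-transformers==1.2.0 OpenAttack==2.0.1 GPUtil==1.4.0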

Train the original prefix P_θ

For the training phase of standard prefix-tuning, the command is:

  source train.sh --preseqlen [A] --learning_rate [B] --tasks [C] --n_train_epochs [D] --device [E]

where

  • [A]: The length of the prefix P_θ.
  • [B]: The (initial) learning rate.
  • [C]: The benchmark. Default: sst.
  • [D]: The total number of training epochs.
  • [E]: The id of the GPU to be used.
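
Conceptually, prefix-tuning freezes the pretrained LM and trains only per-layer key/value vectors of length preseqlen that are prepended to every attention computation. Below is a minimal sketch of the mechanism (the module and dimensions are hypothetical, using GPT-2 small's shapes; the repo's actual implementation follows the LAMOL codebase):

  import torch
  import torch.nn as nn
  from pytorch_transformers import GPT2LMHeadModel

  class Prefix(nn.Module):
      """Trainable prefix: one (key, value) block per layer, prepended to attention."""
      def __init__(self, preseqlen, n_layer=12, n_head=12, d_head=64):
          super().__init__()
          self.past = nn.Parameter(
              0.02 * torch.randn(n_layer, 2, n_head, preseqlen, d_head))

      def forward(self, batch_size):
          # per-layer tensors shaped (2, batch, n_head, preseqlen, d_head),
          # i.e. the `past` format GPT-2 expects in pytorch-transformers 1.2.0
          return [layer.unsqueeze(1).expand(-1, batch_size, -1, -1, -1)
                  for layer in self.past]

  model = GPT2LMHeadModel.from_pretrained("gpt2")
  for p in model.parameters():
      p.requires_grad = False      # the LM stays frozen; only the prefix is trained

  prefix = Prefix(preseqlen=10)
  input_ids = torch.tensor([[318, 428, 3807, 922]])    # arbitrary token ids
  logits = model(input_ids, past=prefix(input_ids.size(0)))[0]

Only the prefix parameters receive gradients, which is what makes prefix-tuning modular: the same frozen LM can serve many tasks by swapping prefixes.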

We can also use adversarial training to improve the robustness of the prefix. For the training phase of adversarial prefix-tuning, the command is:

  source train_adv.sh --preseqlen [A] --learning_rate [B] --tasks [C] --n_train_epochs [D] --device [E] --pgd_ball [F]

where

  • [A]~[E] have the same meanings as above.
  • [F]: Whether the PGD norm ball is word-wise or sentence-wise (see the sketch after the note below).

Note that DATA_DIR and MODEL_DIR in train_adv.sh differ from those in train.sh. When experimenting with the adversarially trained prefix P_θ in the following steps, remember to switch DATA_DIR and MODEL_DIR in the corresponding scripts as well.
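
For reference, adversarial training of this kind perturbs the input word embeddings inside a norm ball with PGD and trains the prefix on the perturbed inputs. A minimal word-wise L2-ball sketch follows (loss_fn is a hypothetical closure computing the task loss from embeddings; this is a simplification, not the repo's exact procedure):

  import torch

  def pgd_perturb(emb, loss_fn, eps=1.0, alpha=0.3, steps=3):
      """Word-wise PGD: find an L2-bounded perturbation of the embeddings
      that maximizes the task loss computed by `loss_fn`."""
      delta = torch.zeros_like(emb, requires_grad=True)
      for _ in range(steps):
          loss = loss_fn(emb + delta)    # forward pass with perturbed embeddings
          loss.backward()
          with torch.no_grad():
              # normalized gradient-ascent step per word vector
              delta += alpha * delta.grad / (delta.grad.norm(dim=-1, keepdim=True) + 1e-12)
              # project each word's perturbation back into the eps-ball
              norms = delta.norm(dim=-1, keepdim=True).clamp(min=eps)
              delta *= eps / norms
          delta.grad.zero_()
      return delta.detach()

The prefix is then trained on the loss at emb + delta, i.e., on the worst-case perturbation found inside the ball; the sentence-wise variant constrains the norm of the whole sequence's perturbation instead.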

Generate Adversarial Examples

We use the OpenAttack package to generate in-sentence adversaries. The command is:

  source generate_adv_insent.sh --preseqlen [A] --learning_rate [B] --tasks [C] --device [E] --test_ep [G] --attack [H]

where

  • [A], [B], [C], [E] have the same meanings as above.
  • [G]: Load the prefix P_θ parameters trained for [G] epochs for testing. We set G=D.
  • [H]: Generate adversarial examples from the clean test set with the in-sentence attack [H].
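
For orientation, an OpenAttack-driven attack is set up roughly as below. This is a sketch in the style of the OpenAttack 2.0.x examples; victim and test_dataset are placeholders for the wrapped prefix-tuned classifier and the loaded test set, and the exact evaluation entry point differs across OpenAttack versions (see the compatibility note under Released Data & Models):

  import OpenAttack as oa

  # a word-substitution attacker; oa.attackers.TextBuggerAttacker covers TextBugger ("bug")
  attacker = oa.attackers.PWWSAttacker()

  # `victim` wraps the prefix-tuned GPT-2 classifier behind OpenAttack's classifier
  # interface; `test_dataset` is the clean test set (both placeholders here)
  attack_eval = oa.attack_evals.DefaultAttackEval(attacker, victim)
  attack_eval.eval(test_dataset, visualize=True)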

We also implement the Universal Adversarial Trigger (UAT) attack. The command is:

  source generate_adv_uat.sh --preseqlen [A] --learning_rate [B] --tasks [C] --device [E] --test_ep [G] --attack clean-[H2] --uat_len [I] --uat_epoch [J]

where

  • [A], [B], [C], [E], [G] have the same meanings as above.
  • [H2]: UATs are searched per class, and [H2] indicates the class id: 0/1 for SST, 0/1/2/3 for AG News, and 0/1/2 for SNLI.
  • [I]: The length of the UAT.
  • [J]: The number of epochs for the UAT search.
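
Universal Adversarial Triggers are input-agnostic token sequences searched by gradient: each trigger token is repeatedly replaced with the vocabulary token whose embedding most increases the target loss, a HotFlip-style first-order approximation. A minimal sketch of one replacement step (hypothetical names; simplified from Wallace et al., 2019):

  import torch

  def replace_trigger_token(trigger_ids, pos, grad_at_pos, embedding_matrix):
      """HotFlip-style replacement: pick the vocabulary token whose embedding
      most increases the loss at trigger position `pos`.

      grad_at_pos: gradient of the loss w.r.t. the embedding at `pos`, shape (d,)
      embedding_matrix: the LM's input embedding table, shape (vocab_size, d)
      """
      scores = embedding_matrix @ grad_at_pos   # approximate loss change per candidate
      trigger_ids[pos] = torch.argmax(scores).item()
      return trigger_ids

Sweeping this step over the [I] trigger positions for [J] epochs, with the target class fixed, yields one UAT per class.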

Test the performance of P_θ

The command for performance testing of P_θ under clean data and in-sentence attacks is:

  source test_prefix_theta_insent.sh --preseqlen [A] --learning_rate [B] --tasks [C] --device [E] --test_ep [G] --attack [H] --test_batch_size [K]

Under UAT attack, the test command is:

  source test_prefix_theta_uat.sh --preseqlen [A] --learning_rate [B] --tasks [C] --device [E] --test_ep [G] --attack clean --uat_len [I] --test_batch_size [K]

where

  • [A]~[I] have the same meanings as above.
  • [K]: The test batch size. When K=0, the batch size is adaptive (determined by free GPU memory; see the sketch below); when K>0, the batch size is fixed.
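
For the adaptive case, one way to choose a batch size from free GPU memory uses GPUtil (a sketch; the heuristic mapping from free memory to batch size is hypothetical, not the repo's exact rule):

  import GPUtil

  def adaptive_batch_size(device_id, mb_per_example=150, cap=64):
      """Pick a batch size from the free memory on GPU `device_id`."""
      free_mb = GPUtil.getGPUs()[device_id].memoryFree   # free GPU memory in MB
      return max(1, min(cap, int(free_mb // mb_per_example)))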

Robust Prefix P'_ψ: Constructing the canonical manifolds

By constructing the canonical manifolds with PCA, we get the projection matrices. The command is:

  source get_proj.sh --preseqlen [A] --learning_rate [B] --tasks [C] --device [E] --test_ep [G]

where [A]~[G] have the same meanings as above.
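
Concretely, a canonical manifold here is the subspace spanned by the top principal components of the layer activations on correctly classified clean data, and the projection matrix maps activations onto it. A minimal sketch of the construction (hypothetical names; not the repo's exact code):

  import torch

  def projection_matrix(activations, k):
      """Build the rank-k projection onto the canonical manifold.

      activations: (n_examples, d) layer activations collected on correctly
      classified clean training data.
      """
      centered = activations - activations.mean(dim=0, keepdim=True)
      _, _, V = torch.svd(centered)    # columns of V are principal directions
      Vk = V[:, :k]                    # keep the top-k components
      return Vk @ Vk.t()               # (d, d) orthogonal projection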

Robust Prefix P'_ψ: Test its performance

Under clean data and in-sentence attacks, the command is:

  source test_robust_prefix_psi_insent.sh --preseqlen [A] --learning_rate [B] --tasks [C] --device [E] --test_ep [G] --attack [H] --test_batch_size [K] --PMP_lr [L] --PMP_iter [M]

Under UAT attack, the test command is:

  source test_robust_prefix_psi_uat.sh --preseqlen [A] --learning_rate [B] --tasks [C] --device [E] --test_ep [G] --attack clean --uat_len [I] --test_batch_size [K] --PMP_lr [L] --PMP_iter [M]

where

  • [A]~[K] have the same meanings as above.
  • [L]: The learning rate for test-time tuning of P'_ψ.
  • [M]: The number of iterations for test-time tuning of P'_ψ.
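
At test time, the extra prefix P'_ψ is tuned on each batch so that the activations triggered by the input stay close to the canonical manifolds, while P_θ and the LM stay frozen. A minimal sketch of the loop (forward_with_prefixes is a hypothetical helper; a sketch of the idea, not the repo's exact objective):

  import torch

  def tune_prefix_psi(psi_init, batch, proj, lr=0.15, iters=10):
      """Tune the additional prefix P'_psi on one test batch; P_theta stays frozen."""
      psi = psi_init.clone().requires_grad_(True)
      optimizer = torch.optim.Adam([psi], lr=lr)
      for _ in range(iters):
          # layer activations of the frozen model prompted with [P_theta; P'_psi; input]
          h = forward_with_prefixes(psi, batch)   # hypothetical helper, returns (batch, d)
          # steer activations toward the canonical manifold: minimize ||h - h P||^2
          loss = ((h - h @ proj) ** 2).sum()
          optimizer.zero_grad()
          loss.backward()
          optimizer.step()
      return psi.detach()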

Running Example

# Train the original prefix P_θ
source train.sh --tasks sst --n_train_epochs 100 --device 0
source train_adv.sh --tasks sst --n_train_epochs 100 --device 1 --pgd_ball word

# Generate Adversarial Examples
source generate_adv_insent.sh --tasks sst --device 0 --test_ep 100 --attack bug
source generate_adv_uat.sh --tasks sst --device 0 --test_ep 100 --attack clean-0 --uat_len 3 --uat_epoch 10
source generate_adv_uat.sh --tasks sst --device 0 --test_ep 100 --attack clean-1 --uat_len 3 --uat_epoch 10

# Test the performance of P_θ
source test_prefix_theta_insent.sh --tasks sst --device 0 --test_ep 100 --attack bug --test_batch_size 0
source test_prefix_theta_uat.sh --tasks sst --device 0 --test_ep 100 --attack clean --uat_len 3 --test_batch_size 0

# Robust Prefix P'_ψ: Constructing the canonical manifolds
source get_proj.sh --tasks sst --device 0 --test_ep 100

# Robust Prefix P'_ψ: Test its performance
source test_robust_prefix_psi_insent.sh --tasks sst --device 0 --test_ep 100 --attack bug --test_batch_size 0 --PMP_lr 0.15 --PMP_iter 10
source test_robust_prefix_psi_uat.sh --tasks sst --device 0 --test_ep 100 --attack clean --uat_len 3 --test_batch_size 0 --PMP_lr 0.05 --PMP_iter 10

Released Data & Models

Training the original prefix P_θ and generating adversarial examples can be time-consuming. As shown in our paper, adversarial prefix-tuning is particularly slow. Generating adversaries also takes effort, since different attacks have to be performed on the test set for each trained prefix. We also found that OpenAttack has been upgraded to v2.1.1, which causes compatibility issues in our code (test_prefix_theta_insent.py).

To facilitate research on the robustness of prefix-tuning, we release the prefix checkpoints P_θ (with both standard and adversarial training), the processed test sets perturbed by the in-sentence attacks (including PWWS and TextBugger), and the generated projection matrices of the canonical manifolds from our runs, for reproducibility and further enhancement. We have also hard-coded the exploited UAT tokens in test_prefix_theta_uat.py and test_robust_prefix_psi_uat.py. All the materials can be found here.

Acknowledgements

Our implementation of robust prefix-tuning is based on the LAMOL repo, the code for LAMOL: LAnguage MOdeling for Lifelong Language Learning, which studies NLP lifelong learning with GPT-style pretrained language models.

Bibtex

If you find this repository useful for your research, please consider citing our work:

@inproceedings{
  yang2022on,
  title={On Robust Prefix-Tuning for Text Classification},
  author={Zonghan Yang and Yang Liu},
  booktitle={International Conference on Learning Representations},
  year={2022},
  url={https://openreview.net/forum?id=eBCmOocUejf}
}