git《Self-Attention Attribution: Interpreting Information Interactions Inside Transformer》(AAAI 2021) GitHub:

Last update: Dec 29, 2022

Related tags

Overview

Self-Attention Attribution

This repository contains the implementation for AAAI-2021 paper Self-Attention Attribution: Interpreting Information Interactions Inside Transformer. It includes the code for generating the self-attention attribution score, pruning attention heads with our method, constructing the attribution tree and extracting the adversarial triggers. All of our experiments are conducted on bert-base-cased model, our methods can also be easily transfered to other Transformer-based models.

Requirements

Python version >= 3.5
Pytorch version == 1.1.0
networkx == 2.3

We recommend you to run the code using the docker under Linux:

docker run -it --rm --runtime=nvidia --ipc=host --privileged pytorch/pytorch:1.1.0-cuda10.0-cudnn7.5-devel bash

Then install the following packages with pip:

pip install --user networkx==2.3
pip install --user matplotlib==3.1.0
pip install --user tensorboardX six numpy tqdm scikit-learn

You can install attattr from source:

git clone https://github.com/YRdddream/attattr
cd attattr
pip install --user --editable .

Download Pre-Finetuned Models and Datasets

Before running self-attention attribution, you can first fine-tune bert-base-cased model on a downstream task (such as MNLI) by running the file run_classifier_orig.py. We also provide the example datasets and the pre-finetuned checkpoints at Google Drive.

Get Self-Attention Attribution Scores

Run the following command to get the self-attention attribution score and the self-attention score.

python examples/generate_attrscore.py --task_name ${task_name} --data_dir ${data_dir} \
       --bert_model bert-base-cased --batch_size 16 --num_batch 4 \
       --model_file ${model_file} --example_index ${example_index} \
       --get_att_attr --get_att_score --output_dir ${output_dir}

Construction of Attribution Tree

When you get the self-attribution scores of a target example, you could construct the attribution tree. We recommend you to run the file get_tokens_and_pred.py to summarize the data, or you can manually change the value of tokens in attribution_tree.py.

python examples/attribution_tree.py --attr_file ${attr_file} --tokens_file ${tokens_file} \
       --task_name ${task_name} --example_index ${example_index}

You can generate the attribution tree from the provided example.

python examples/attribution_tree.py --attr_file ${model_and_data}/mnli_example/attr_zero_base_exp16.json \
       --tokens_file ${model_and_data}/mnli_example/tokens_and_pred_100.json \
       --task_name mnli --example_index 16

Self-Attention Head Pruning

We provide the code of pruning attention heads with both our attribution method and the Taylor expansion method. Pruning heads with our method.

python examples/prune_head_with_attr.py --task_name ${task_name} --data_dir ${data_dir} \
       --bert_model bert-base-cased --model_file ${model_file}  --output_dir ${output_dir}

Pruning heads with Taylor expansion method.

python examples/prune_head_with_taylor.py --task_name ${task_name} --data_dir ${data_dir} \
       --bert_model bert-base-cased --model_file ${model_file}  --output_dir ${output_dir}

Adversarial Attack

First extract the most important connections from the training dataset.

python examples/run_adver_connection.py --task_name ${task_name} --data_dir ${data_dir} \
       --bert_model bert-base-cased --batch_size 16 --num_batch 4 --zero_baseline \
       --model_file ${model_file} --output_dir ${output_dir}

Then use these adversarial triggers to attack the original model.

python examples/run_adver_evaluate.py --task_name ${task_name} --data_dir ${data_dir} \
       --bert_model bert-base-cased --model_file ${model_file} \
       --output_dir ${output_dir} --pattern_file ${pattern_file}

Reference

If you find this repository useful for your work, you can cite the paper:

@inproceedings{attattr,
  author = {Yaru Hao and Li Dong and Furu Wei and Ke Xu},
  title = {Self-Attention Attribution: Interpreting Information Interactions Inside Transformer},
  booktitle = {The Thirty-Fifth {AAAI} Conference on Artificial Intelligence},
  publisher = {{AAAI} Press},
  year      = {2021},
  url       = {https://arxiv.org/pdf/2004.11207.pdf}
}

git《Self-Attention Attribution: Interpreting Information Interactions Inside Transformer》(AAAI 2021) GitHub:

Related tags

Overview

Self-Attention Attribution

Requirements

Download Pre-Finetuned Models and Datasets

Get Self-Attention Attribution Scores

Construction of Attribution Tree

Self-Attention Head Pruning

Adversarial Attack

Reference

Owner

Codes for ACL-IJCNLP 2021 Paper "Zero-shot Fact Verification by Claim Generation"

Repo for my Tensorflow/Keras CV experiments. Mostly revolving around the Danbooru20xx dataset

Deep Learning and Logical Reasoning from Data and Knowledge

This is our ARTS test set, an enriched test set to probe Aspect Robustness of ABSA.

OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track)

Semi-automated OpenVINO benchmark_app with variable parameters

The source code and dataset for the RecGURU paper (WSDM 2022)

PyTorch implementation of MuseMorphose, a Transformer-based model for music style transfer.

Direct application of DALLE-2 to video synthesis, using factored space-time Unet and Transformers

using yolox+deepsort for object-tracker

A framework for multi-step probabilistic time-series/demand forecasting models

Vertex AI: Serverless framework for MLOPs (ESP / ENG)

A Strong Baseline for Image Semantic Segmentation

Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch

这是一个facenet-pytorch的库，可以用于训练自己的人脸识别模型。

Surrogate- and Invariance-Boosted Contrastive Learning (SIB-CL)

Pytorch implementation of Learning with Opponent-Learning Awareness

Talk covering the features of skorch

Image-Scaling Attacks and Defenses

PyTorch implementation of paper A Fast Knowledge Distillation Framework for Visual Recognition.