git《Self-Attention Attribution: Interpreting Information Interactions Inside Transformer》(AAAI 2021) GitHub:

Related tags

Deep Learningattattr
Overview

Self-Attention Attribution

This repository contains the implementation for AAAI-2021 paper Self-Attention Attribution: Interpreting Information Interactions Inside Transformer. It includes the code for generating the self-attention attribution score, pruning attention heads with our method, constructing the attribution tree and extracting the adversarial triggers. All of our experiments are conducted on bert-base-cased model, our methods can also be easily transfered to other Transformer-based models.

Requirements

  • Python version >= 3.5
  • Pytorch version == 1.1.0
  • networkx == 2.3

We recommend you to run the code using the docker under Linux:

docker run -it --rm --runtime=nvidia --ipc=host --privileged pytorch/pytorch:1.1.0-cuda10.0-cudnn7.5-devel bash

Then install the following packages with pip:

pip install --user networkx==2.3
pip install --user matplotlib==3.1.0
pip install --user tensorboardX six numpy tqdm scikit-learn

You can install attattr from source:

git clone https://github.com/YRdddream/attattr
cd attattr
pip install --user --editable .

Download Pre-Finetuned Models and Datasets

Before running self-attention attribution, you can first fine-tune bert-base-cased model on a downstream task (such as MNLI) by running the file run_classifier_orig.py. We also provide the example datasets and the pre-finetuned checkpoints at Google Drive.

Get Self-Attention Attribution Scores

Run the following command to get the self-attention attribution score and the self-attention score.

python examples/generate_attrscore.py --task_name ${task_name} --data_dir ${data_dir} \
       --bert_model bert-base-cased --batch_size 16 --num_batch 4 \
       --model_file ${model_file} --example_index ${example_index} \
       --get_att_attr --get_att_score --output_dir ${output_dir}

Construction of Attribution Tree

When you get the self-attribution scores of a target example, you could construct the attribution tree. We recommend you to run the file get_tokens_and_pred.py to summarize the data, or you can manually change the value of tokens in attribution_tree.py.

python examples/attribution_tree.py --attr_file ${attr_file} --tokens_file ${tokens_file} \
       --task_name ${task_name} --example_index ${example_index} 

You can generate the attribution tree from the provided example.

python examples/attribution_tree.py --attr_file ${model_and_data}/mnli_example/attr_zero_base_exp16.json \
       --tokens_file ${model_and_data}/mnli_example/tokens_and_pred_100.json \
       --task_name mnli --example_index 16

Self-Attention Head Pruning

We provide the code of pruning attention heads with both our attribution method and the Taylor expansion method. Pruning heads with our method.

python examples/prune_head_with_attr.py --task_name ${task_name} --data_dir ${data_dir} \
       --bert_model bert-base-cased --model_file ${model_file}  --output_dir ${output_dir}

Pruning heads with Taylor expansion method.

python examples/prune_head_with_taylor.py --task_name ${task_name} --data_dir ${data_dir} \
       --bert_model bert-base-cased --model_file ${model_file}  --output_dir ${output_dir}

Adversarial Attack

First extract the most important connections from the training dataset.

python examples/run_adver_connection.py --task_name ${task_name} --data_dir ${data_dir} \
       --bert_model bert-base-cased --batch_size 16 --num_batch 4 --zero_baseline \
       --model_file ${model_file} --output_dir ${output_dir}

Then use these adversarial triggers to attack the original model.

python examples/run_adver_evaluate.py --task_name ${task_name} --data_dir ${data_dir} \
       --bert_model bert-base-cased --model_file ${model_file} \
       --output_dir ${output_dir} --pattern_file ${pattern_file}

Reference

If you find this repository useful for your work, you can cite the paper:

@inproceedings{attattr,
  author = {Yaru Hao and Li Dong and Furu Wei and Ke Xu},
  title = {Self-Attention Attribution: Interpreting Information Interactions Inside Transformer},
  booktitle = {The Thirty-Fifth {AAAI} Conference on Artificial Intelligence},
  publisher = {{AAAI} Press},
  year      = {2021},
  url       = {https://arxiv.org/pdf/2004.11207.pdf}
}
Official repository for "Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring".

RNN-MBP Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring (AAAI-2022) by Chao Zhu, Hang Dong, Jinshan Pan

SIV-LAB 22 Aug 31, 2022
Code repository for Semantic Terrain Classification for Off-Road Autonomous Driving

BEVNet Datasets Datasets should be put inside data/. For example, data/semantic_kitti_4class_100x100. Training BEVNet-S Example: cd experiments bash t

(Brian) JoonHo Lee 24 Dec 12, 2022
Federated learning on graph, especially on graph neural networks (GNNs), knowledge graph, and private GNN.

Federated learning on graph, especially on graph neural networks (GNNs), knowledge graph, and private GNN.

keven 198 Dec 20, 2022
EqGAN - Improving GAN Equilibrium by Raising Spatial Awareness

EqGAN - Improving GAN Equilibrium by Raising Spatial Awareness Improving GAN Equilibrium by Raising Spatial Awareness Jianyuan Wang, Ceyuan Yang, Ying

GenForce: May Generative Force Be with You 149 Dec 19, 2022
This project aims to be a handler for input creation and running of multiple RICEWQ simulations.

What is autoRICEWQ? This project aims to be a handler for input creation and running of multiple RICEWQ simulations. What is RICEWQ? From the descript

Yass Fuentes 1 Feb 01, 2022
ML-Ensemble – high performance ensemble learning

A Python library for high performance ensemble learning ML-Ensemble combines a Scikit-learn high-level API with a low-level computational graph framew

Sebastian Flennerhag 764 Dec 31, 2022
Learning Off-Policy with Online Planning, CoRL 2021

LOOP: Learning Off-Policy with Online Planning Accepted in Conference of Robot Learning (CoRL) 2021. Harshit Sikchi, Wenxuan Zhou, David Held Paper In

Harshit Sikchi 24 Nov 22, 2022
基于AlphaPose的TensorRT加速

1. Requirements CUDA 11.1 TensorRT 7.2.2 Python 3.8.5 Cython PyTorch 1.8.1 torchvision 0.9.1 numpy 1.17.4 (numpy版本过高会出报错 this issue ) python-package s

52 Dec 06, 2022
DeepLab2: A TensorFlow Library for Deep Labeling

DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a unified and state-of-the-art TensorFlow codebase for dense pixel labeling tasks.

Google Research 845 Jan 04, 2023
Highly comparative time-series analysis

〰️ hctsa 〰️ : highly comparative time-series analysis hctsa is a software package for running highly comparative time-series analysis using Matlab (fu

Ben Fulcher 569 Dec 21, 2022
3D detection and tracking viewer (visualization) for kitti & waymo dataset

3D detection and tracking viewer (visualization) for kitti & waymo dataset

222 Jan 08, 2023
MMFlow is an open source optical flow toolbox based on PyTorch

Documentation: https://mmflow.readthedocs.io/ Introduction English | 简体中文 MMFlow is an open source optical flow toolbox based on PyTorch. It is a part

OpenMMLab 688 Jan 06, 2023
Supervised forecasting of sequential data in Python.

Supervised forecasting of sequential data in Python. Intro Supervised forecasting is the machine learning task of making predictions for sequential da

The Alan Turing Institute 54 Nov 15, 2022
Matthew Colbrook 1 Apr 08, 2022
HAR-stacked-residual-bidir-LSTMs - Deep stacked residual bidirectional LSTMs for HAR

HAR-stacked-residual-bidir-LSTM The project is based on this repository which is presented as a tutorial. It consists of Human Activity Recognition (H

Guillaume Chevalier 287 Dec 27, 2022
🌎 The Modern Declarative Data Flow Framework for the AI Empowered Generation.

🌎 JSONClasses JSONClasses is a declarative data flow pipeline and data graph framework. Official Website: https://www.jsonclasses.com Official Docume

Fillmula Inc. 53 Dec 09, 2022
This repo is to be freely used by ML devs to check the GAN performances without coding from scratch.

GANs for Fun Created because I can! GOAL The goal of this repo is to be freely used by ML devs to check the GAN performances without coding from scrat

Sagnik Roy 13 Jan 26, 2022
This implements the learning and inference/proposal algorithm described in "Learning to Propose Objects, Krähenbühl and Koltun"

Learning to propose objects This implements the learning and inference/proposal algorithm described in "Learning to Propose Objects, Krähenbühl and Ko

Philipp Krähenbühl 90 Sep 10, 2021
Generative Models as a Data Source for Multiview Representation Learning

GenRep Project Page | Paper Generative Models as a Data Source for Multiview Representation Learning Ali Jahanian, Xavier Puig, Yonglong Tian, Phillip

Ali 81 Dec 03, 2022
IDA file loader for UF2, created for the DEFCON 29 hardware badge

UF2 Loader for IDA The DEFCON 29 badge uses the UF2 bootloader, which conveniently allows you to dump and flash the firmware over USB as a mass storag

Kevin Colley 6 Feb 08, 2022