Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Includes examples for DETR and VQA.

Last update: Jan 07, 2023

Overview

PyTorch Implementation of Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers

1 Using Colab

Note that the notebook assumes you are using a GPU. To switch runtimes, go to Runtime -> Change runtime type and select GPU.
Installing all the requirements may take some time. After installation, please restart the runtime.
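
After restarting the runtime, a quick check like the following can confirm that a GPU is actually visible. This is a minimal sketch, not part of the notebooks: `pick_device` is a hypothetical helper name, and it falls back to `"cpu"` when PyTorch is not installed.

```python
def pick_device():
    """Return "cuda" if PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check also works before installation
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"Running on: {pick_device()}")
```

If this prints `cpu` in Colab, the runtime type has most likely not been switched to GPU yet.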

2 Running Examples

There are two Jupyter notebooks for running the examples presented in the paper.

The notebook for LXMERT contains both the examples from the paper and examples with images from the internet and free-form questions. To use your own input, simply change the URL variable to your image and the question variable to your free-form question.
The notebook for DETR contains the examples from the paper. To use your own input, simply change the URL variable to your image.

3 Reproduction of results

3.1 VisualBERT

Run the run.py script as follows:

CUDA_VISIBLE_DEVICES=0 PYTHONPATH=`pwd` python VisualBERT/run.py --method=<method_name> --is-text-pert=<true/false> --is-positive-pert=<true/false> --num-samples=10000 config=projects/visual_bert/configs/vqa2/defaults.yaml model=visual_bert dataset=vqa2 run_type=val checkpoint.resume_zoo=visual_bert.finetuned.vqa2.from_coco_train env.data_dir=/path/to/data_dir training.num_workers=0 training.batch_size=1 training.trainer=mmf_pert training.seed=1234
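
Since the command takes two boolean perturbation flags, a full reproduction involves four runs per method. The sketch below assembles those four commands in Python. The helper names `build_visualbert_cmd` and `perturbation_settings` are hypothetical and not part of this repository, and the method name is left as the `<method_name>` placeholder because the valid values are defined by `run.py`.

```python
import os
import shlex
import subprocess  # only needed if the run line below is uncommented


def build_visualbert_cmd(method, is_text_pert, is_positive_pert, data_dir,
                         num_samples=10000):
    """Return the argument list for VisualBERT/run.py, mirroring the command above."""
    def flag(value):
        return "true" if value else "false"

    return [
        "python", "VisualBERT/run.py",
        f"--method={method}",
        f"--is-text-pert={flag(is_text_pert)}",
        f"--is-positive-pert={flag(is_positive_pert)}",
        f"--num-samples={num_samples}",
        "config=projects/visual_bert/configs/vqa2/defaults.yaml",
        "model=visual_bert",
        "dataset=vqa2",
        "run_type=val",
        "checkpoint.resume_zoo=visual_bert.finetuned.vqa2.from_coco_train",
        f"env.data_dir={data_dir}",
        "training.num_workers=0",
        "training.batch_size=1",
        "training.trainer=mmf_pert",
        "training.seed=1234",
    ]


def perturbation_settings():
    """All (is_text_pert, is_positive_pert) combinations."""
    return [(text, positive) for text in (False, True) for positive in (False, True)]


if __name__ == "__main__":
    # Same environment as the shell command: GPU 0, repository root on PYTHONPATH.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES="0", PYTHONPATH=os.getcwd())
    for is_text, is_positive in perturbation_settings():
        cmd = build_visualbert_cmd("<method_name>", is_text, is_positive,
                                   "/path/to/data_dir")
        print(shlex.join(cmd))
        # subprocess.run(cmd, env=env, check=True)  # uncomment to launch the run
```

As written, the script only prints the commands, so they can be inspected before anything is launched.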

Note

If the datasets aren't already in env.data_dir, then the script will download the data automatically to the path in env.data_dir.

3.2 LXMERT

Download valid.json:

pushd data/vqa
wget https://nlp.cs.unc.edu/data/lxmert_data/vqa/valid.json
popd
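
Where `wget` is not available, the same download can be done from Python. This is a rough equivalent under stated assumptions: `download_valid_json` is a hypothetical helper, it is meant to be run from the repository root, and unlike `wget` it skips the download when the file already exists.

```python
import os
from urllib.request import urlretrieve

VALID_JSON_URL = "https://nlp.cs.unc.edu/data/lxmert_data/vqa/valid.json"


def download_valid_json(dest_dir="data/vqa", url=VALID_JSON_URL, fetch=urlretrieve):
    """Download valid.json into dest_dir and return its local path.

    `fetch` defaults to urllib's urlretrieve; it is a parameter only so the
    function can be exercised without network access.
    """
    os.makedirs(dest_dir, exist_ok=True)
    target = os.path.join(dest_dir, os.path.basename(url))
    if not os.path.exists(target):
        fetch(url, target)
    return target
```

Calling `download_valid_json()` with no arguments places the file at `data/vqa/valid.json`, the same location the shell commands use.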

Download the COCO_val2014 set to your local machine.

Note

If you already downloaded COCO_val2014 for the VisualBERT tests, you can simply use the same path you used for VisualBERT.

Run the perturbation.py script as follows:

CUDA_VISIBLE_DEVICES=0 PYTHONPATH=`pwd` python lxmert/lxmert/perturbation.py  --COCO_path /path/to/COCO_val2014 --method <method_name> --is-text-pert <true/false> --is-positive-pert <true/false>

3.3 DETR

Download the COCO dataset as described in the DETR repository. Note that you only need the validation set.
Lower the IoU minimum threshold from 0.5 to 0.2 using the following steps:
- Locate the cocoeval.py script in your Python library path:
  
  Find the library path:
```
import sys
print(sys.path)
```
  Find cocoeval.py:
```
cd /path/to/lib
find . -name cocoeval.py
```
- Change the self.iouThrs value in the setDetParams function (which sets the parameters for the COCO detection evaluation) in the Params class as follows:
  
  instead of:
```
self.iouThrs = np.linspace(.5, 0.95, int(np.round((0.95 - .5) / .05)) + 1, endpoint=True)
```
  use:
```
self.iouThrs = np.linspace(.2, 0.95, int(np.round((0.95 - .2) / .05)) + 1, endpoint=True)
```
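
The two steps above can also be checked from Python. The sketch below assumes that cocoeval.py is the one shipped with the `pycocotools` package (an assumption; the steps above do not name the package). `find_cocoeval` and `iou_thresholds` are hypothetical helper names, and `iou_thresholds` reproduces the `np.linspace` expression in plain Python so the effect of the change can be inspected without editing anything.

```python
import importlib.util


def find_cocoeval():
    """Return the path of pycocotools/cocoeval.py, or None if it is not installed."""
    try:
        spec = importlib.util.find_spec("pycocotools.cocoeval")
    except ModuleNotFoundError:  # raised when the parent package is missing
        return None
    return spec.origin if spec else None


def iou_thresholds(start, stop=0.95, step=0.05):
    """Plain-Python equivalent of the np.linspace call used for self.iouThrs."""
    count = int(round((stop - start) / step)) + 1
    return [start + i * (stop - start) / (count - 1) for i in range(count)]


print("cocoeval.py:", find_cocoeval())
print("default :", len(iou_thresholds(0.5)), "thresholds")
print("modified:", len(iou_thresholds(0.2)), "thresholds")
```

With the default setting there are 10 thresholds from 0.5 to 0.95, and with the modified setting there are 16 thresholds from 0.2 to 0.95.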

To run the segmentation experiment, use the following command:

CUDA_VISIBLE_DEVICES=0 PYTHONPATH=`pwd`  python DETR/main.py --coco_path /path/to/coco/dataset  --eval --masks --resume https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth --batch_size 1 --method <method_name>

4 Credits

VisualBERT implementation is based on the MMF framework.
LXMERT implementation is based on the official LXMERT implementation and on Hugging Face Transformers.
DETR implementation is based on the official DETR implementation.

Owner

Hila Chefer
