Explaining neural decisions contrastively to alternative decisions.

Last update: Oct 16, 2022

Related tags

Deep Learning contrastive-explanations

Overview

Contrastive Explanations for Model Interpretability

This is the repository for the paper "Contrastive Explanations for Model Interpretability", about explaining neural model decisions against alternative decisions.

Authors: Alon Jacovi, Swabha Swayamdipta, Shauli Ravfogel, Yanai Elazar, Yejin Choi, Yoav Goldberg

Getting Started

Setup

conda create -n contrastive python=3.8
conda activate contrastive
pip install allennlp==1.2.0rc1
pip install allennlp-models==1.2.0rc1.dev20201014
pip install jupyterlab
pip install pandas
bash scripts/download_data.sh

Contrastive projection

If you're here just to know how we implemented contrastive projection, here it is:

u = classifier_w[fact_idx] - classifier_w[foil_idx]
contrastive_projection = np.outer(u, u) / np.dot(u, u)

Very simple :)

contrastive_projection is a projection matrix that projects the model's latent representation of some example h into the direction of h that separates the logits of the fact and foil.

Training MNLI/BIOS models

bash scripts/train_sequence_classification.sh

Highlight ranking (Sections 4.3, 5.3)

Run the notebooks/mnli-highlight-featurerank.ipynb or notebooks/bios-highlight-featurerank.ipynb jupyter notebooks.

These notebooks load the respective models, and then run the highlight ranking procedure.

Foil ranking (Section 4.1)

First, cache the model's encodings of the dev set examples:

bash scripts/cache_encodings_bios.sh

Then run the notebooks/bios-highlight-foilrank.ipynb notebook.

Contrastive decision making (Section 4.4)

First, cache the model's encodings of the dev set examples (skip if already executed):

bash scripts/cache_encodings_bios.sh

Then run the notebooks/bios-foilpower.ipynb notebook.

Foil ranking for BIOS concepts (Section 4.2)

First, generate concept labels as a numpy matrix from the BIOS dataset:

python scripts/bios_concepts.py --data-path data/bios/train.jsonl --concept-path experiments/models/bios/roberta-large/concepts/gender-male/train
python scripts/bios_concepts.py --data-path data/bios/dev.jsonl --concept-path experiments/models/bios/roberta-large/concepts/gender-male/dev
python scripts/bios_concepts.py --data-path data/bios/test.jsonl --concept-path experiments/models/bios/roberta-large/concepts/gender-male/test

Then, run Amnesic Probing:

WIP - to be added soon. Alternatively, refer to the original amnesic probing repository which has the necessary code.

Foil ranking for MNLI concepts (Section 5.2)

Overlap concept:

First, generate concept labels as a numpy matrix from the BIOS dataset:

python scripts/mnli_concepts.py --data-path data/mnli/train.jsonl --concept-path experiments/models/mnli/roberta-large/concepts/overlap/train
python scripts/mnli_concepts.py --data-path data/mnli/dev.jsonl --concept-path experiments/models/mnli/roberta-large/concepts/overlap/dev
python scripts/mnli_concepts.py --data-path data/mnli/test.jsonl --concept-path experiments/models/mnli/roberta-large/concepts/overlap/test

Then, run Amnesic Probing:

WIP - to be added soon. Alternatively, refer to the original amnesic probing repository which has the necessary code.

Negation concept:

The examples we used for the negation concept analysis are:

data/nli_negation_concept/entailment.jsonl  # entailment instances
data/nli_negation_concept/entailment_with_negation.jsonl  # the above entailment instances, paraphrased with negation words
data/nli_negation_concept/neutral.jsonl  # neutral instances
data/nli_negation_concept/neutral_with_negation.jsonl  # the above neutral instances, paraphrased with negation words

To analyze them with respect to the trained MultiNLI model, run the notebook notebooks/mnli-negation-foilrank.ipynb.

Explaining neural decisions contrastively to alternative decisions.

Related tags

Overview

Contrastive Explanations for Model Interpretability

This is the repository for the paper "Contrastive Explanations for Model Interpretability", about explaining neural model decisions against alternative decisions.

Authors: Alon Jacovi, Swabha Swayamdipta, Shauli Ravfogel, Yanai Elazar, Yejin Choi, Yoav Goldberg

Getting Started

Setup

Contrastive projection

Training MNLI/BIOS models

Highlight ranking (Sections 4.3, 5.3)

Foil ranking (Section 4.1)

Contrastive decision making (Section 4.4)

Foil ranking for BIOS concepts (Section 4.2)

Foil ranking for MNLI concepts (Section 5.2)

Overlap concept:

Negation concept:

Owner

AI2

Automatic Image Background Subtraction

🚀 An end-to-end ML applications using PyTorch, W&B, FastAPI, Docker, Streamlit and Heroku

FairMOT for Multi-Class MOT using YOLOX as Detector

Code for models used in Bashiri et al., "A Flow-based latent state generative model of neural population responses to natural images".

Code for our ACL 2021 paper - ConSERT: A Contrastive Framework for Self-Supervised Sentence Representation Transfer

RRxIO - Robust Radar Visual/Thermal Inertial Odometry: Robust and accurate state estimation even in challenging visual conditions.

Code for Towards Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games

TensorFlow-based implementation of "Pyramid Scene Parsing Network".

MoCap-Solver: A Neural Solver for Optical Motion Capture Data

A MNIST-like fashion product database. Benchmark

Implementation of Feedback Transformer in Pytorch

SAS: Self-Augmentation Strategy for Language Model Pre-training

QR2Pass-project - A proof of concept for an alternative (passwordless) authentication system to a web server

NumQMBasic - A mini-course offered to Undergrad physics students

Graph parsing approach to structured sentiment analysis.

[NeurIPS 2021] Official implementation of paper "Learning to Simulate Self-driven Particles System with Coordinated Policy Optimization".

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

Deep Learning and Reinforcement Learning Library for Scientists and Engineers 🔥

DeepLabv3+：Encoder-Decoder with Atrous Separable Convolution语义分割模型在tensorflow2当中的实现

Implementation of Gans