Contrastive Explanation (Foil Trees), developed at TNO/Utrecht University

Last update: Aug 29, 2022

Overview

Contrastive Explanation (Foil Trees)

Contrastive and counterfactual explanations for machine learning (ML)

Marcel Robeer (2018-2020), TNO/Utrecht University

Introduction
Publications: citing this package
Example usage
Documentation: choices for problem explanation
License

Introduction

Contrastive Explanation provides an explanation for why an instance had the current outcome (fact) rather than a targeted outcome of interest (foil). These counterfactual explanations limit the explanation to the features relevant in distinguishing fact from foil, thereby disregarding irrelevant features. The idea of contrastive explanations is captured in this Python package ContrastiveExplanation. Example facts and foils are:

Machine Learning (ML) type	Problem	Explainable AI (XAI) question	Fact	Foil
Classification	Determine type of animal	Why is this instance a cat rather than a dog?	Cat	Dog
Regression analysis	Predict students' grade	Why is the predicted grade for this student 6.5 rather than higher?	6.5	More than 6.5
Clustering	Find similar flowers	Why is this flower in cluster 1 rather than cluster 4?	Cluster 1	Cluster 4

Publications

One scientific paper was published on Contrastive Explanation / Foil Trees:

J. van der Waa, M. Robeer, J. van Diggelen, M. Brinkhuis, and M. Neerincx, "Contrastive Explanations with Local Foil Trees", in 2018 Workshop on Human Interpretability in Machine Learning (WHI 2018), 2018, pp. 41-47. [Online]. Available: http://arxiv.org/abs/1806.07470

It was developed as part of a Master's Thesis at Utrecht University / TNO:

M. Robeer, "Contrastive Explanation for Machine Learning", MSc Thesis, Utrecht University, 2018. [Online]. Available: https://dspace.library.uu.nl/handle/1874/368081

Citing this package

@inproceedings{vanderwaa2018,
  title={{Contrastive Explanations with Local Foil Trees}},
  author={van der Waa, Jasper and Robeer, Marcel and van Diggelen, Jurriaan and Brinkhuis, Matthieu and Neerincx, Mark},
  booktitle={2018 Workshop on Human Interpretability in Machine Learning (WHI)},
  year={2018}
}

Example usage

As a simple example, let us explain a Random Forest classifier that determines the type of flower in the well-known Iris flower classification problem. The data set comprises 150 instances, each belonging to one of three types of flowers (setosa, versicolor and virginica). For each instance, the data set includes four features (sepal length, sepal width, petal length, petal width), and the goal is to determine which type of flower (class) each instance is.

Steps

First, train the 'black-box' model to explain

from sklearn import datasets, model_selection, ensemble
seed = 1

# Train black-box model on Iris data
data = datasets.load_iris()
train, test, y_train, y_test = model_selection.train_test_split(data.data, 
                                                                data.target, 
                                                                train_size=0.80, 
                                                                random_state=seed)
model = ensemble.RandomForestClassifier(random_state=seed)
model.fit(train, y_train)

Next, explain the first test instance (test[0]) by wrapping the tabular data in a DomainMapper and then calling the method ContrastiveExplanation.explain_instance_domain():

# Contrastive explanation
import contrastive_explanation as ce

dm = ce.domain_mappers.DomainMapperTabular(train,
                                           feature_names=data.feature_names,
                                           contrast_names=data.target_names)
exp = ce.ContrastiveExplanation(dm, verbose=True)

sample = test[0]
exp.explain_instance_domain(model.predict_proba, sample)

[OUT] "The model predicted 'setosa' instead of 'versicolor' because 'sepal length (cm) <= 6.517 and petal width (cm) <= 0.868'"

The RandomForestClassifier predicted 'setosa', while the second most probable class, 'versicolor', may have been expected instead. The instance was classified as 'setosa' because its sepal length is less than or equal to 6.517 centimeters and its petal width is less than or equal to 0.868 centimeters. In other words, if the instance kept all other feature values the same but changed its sepal length to more than 6.517 centimeters and its petal width to more than 0.868 centimeters, the black-box classifier would have changed the outcome to 'versicolor'.
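The counterfactual reading can be checked against the black-box model itself. The sketch below retrains the model from the steps above, copies the sample, and pushes the two features named in the explanation just past their thresholds. The thresholds are copied from the example output and the 0.1 margin is an arbitrary choice for illustration; because the explanation comes from a local surrogate tree, the black box is not guaranteed to flip, so the result is printed rather than assumed.

```python
from sklearn import datasets, model_selection, ensemble

seed = 1

# Same black-box model as in the steps above
data = datasets.load_iris()
train, test, y_train, y_test = model_selection.train_test_split(
    data.data, data.target, train_size=0.80, random_state=seed)
model = ensemble.RandomForestClassifier(random_state=seed)
model.fit(train, y_train)

sample = test[0]
fact = str(data.target_names[model.predict([sample])[0]])

# Thresholds taken from the example output; another run may report other values.
margin = 0.1  # arbitrary step past each threshold (assumption)
counterfactual = sample.copy()
counterfactual[0] = max(sample[0], 6.517 + margin)  # sepal length (cm)
counterfactual[3] = max(sample[3], 0.868 + margin)  # petal width (cm)

new_outcome = str(data.target_names[model.predict([counterfactual])[0]])
print(f"original: {fact}, after changing both features: {new_outcome}")
print("class probabilities:", model.predict_proba([counterfactual])[0])
```

If the printed outcome does not change, the surrogate's decision boundary differs from the black box's around this sample, which is worth knowing before acting on the explanation.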

More examples

For more examples, check out the attached Notebook.

Documentation

Several choices can be made to tailor the explanation to your type of explanation problem.

Choices for problem explanation

FactFoil

Used for determining the current outcome (fact) and the outcome of interest (foil), based on a foil_method (e.g. second most probable class, random class, greater than the current outcome). Foils can also be manually selected by using the foil=... optional argument of the ContrastiveExplanation.explain_instance_domain() method.

FactFoil	Description	foil_method
`FactFoilClassification` (default)	Determine fact and foil for classification/unsupervised learning	`second`, `random`
`FactFoilRegression`	Determine fact and foil for regression analysis	`greater`, `smaller`
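To make the two families of foil_method concrete, here is a minimal sketch of what `second` and `greater` mean. It uses only NumPy and illustrates the semantics described above; the function names are hypothetical and this is not the package's internal implementation.

```python
import numpy as np

def second_most_probable(probabilities):
    """Classification: fact is the most probable class, foil the runner-up."""
    order = np.argsort(probabilities)[::-1]  # class indices, most probable first
    return int(order[0]), int(order[1])

def greater_than_fact(fact_value, neighbour_predictions):
    """Regression: any prediction above the current outcome counts as foil."""
    return np.asarray(neighbour_predictions) > fact_value

# Class probabilities for one instance, e.g. from model.predict_proba
fact, foil = second_most_probable(np.array([0.7, 0.2, 0.1]))
print(fact, foil)  # 0 1

# Predicted grades of nearby students, with 6.5 as the fact
mask = greater_than_fact(6.5, [6.0, 6.5, 7.2, 8.0])
print(mask)  # [False False  True  True]
```

A manually chosen foil, as with the foil=... argument mentioned above, simply replaces the runner-up with the class of interest.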

Explanators

Method for forming the explanation, either using a Foil Tree (TreeExplanator) as described in the paper, or using a prototype (PointExplanator, not fully implemented). Because multiple explanations can hold, one can choose the foil_strategy as either 'closest' (shortest explanation), 'size' (move the current outcome to the area containing most samples of the foil outcome), 'impurity' (most informative foil area), or 'random' (random foil area).

Explanator	Description	foil_strategy
`TreeExplanator` (default)	Foil Tree: Explain using a decision tree	`closest`, `size`, `impurity`, `random`
`PointExplanator`	Explain with a representative point (prototype) of the foil class	`closest`, `medoid`, `random`
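The Foil Tree idea with the 'closest' strategy can be sketched with scikit-learn alone. The code below is a simplified illustration, not the package's TreeExplanator: it uses the whole Iris data set as the neighbourhood instead of sampling around the instance, fixes the foil to class 1 (versicolor), and limits the surrogate to depth 3, all of which are assumptions made to keep the example short.

```python
import numpy as np
from sklearn import datasets, ensemble, tree

def root_paths(t):
    """Map each leaf id to its list of (node, went_left) steps from the root."""
    paths, stack = {}, [(0, [])]
    while stack:
        node, path = stack.pop()
        left, right = t.children_left[node], t.children_right[node]
        if left == right:  # both children are -1 for a leaf
            paths[node] = path
        else:
            stack.append((left, path + [(node, True)]))
            stack.append((right, path + [(node, False)]))
    return paths

def shared_prefix(a, b):
    """Number of leading steps two root-to-leaf paths have in common."""
    n = 0
    for step_a, step_b in zip(a, b):
        if step_a != step_b:
            break
        n += 1
    return n

def distance(a, b):
    """Decision nodes to walk up from one leaf and down to the other."""
    k = shared_prefix(a, b)
    return (len(a) - k) + (len(b) - k)

data = datasets.load_iris()
X, names = data.data, data.feature_names
model = ensemble.RandomForestClassifier(random_state=1).fit(X, data.target)

sample, foil = X[0], 1  # foil fixed to 'versicolor' for illustration

# Surrogate tree: foil versus everything else, labelled by the black box
is_foil = (model.predict(X) == foil).astype(int)
surrogate = tree.DecisionTreeClassifier(max_depth=3, random_state=1).fit(X, is_foil)
t = surrogate.tree_
paths = root_paths(t)

fact_leaf = surrogate.apply(sample.reshape(1, -1))[0]
foil_leaves = [leaf for leaf in paths if np.argmax(t.value[leaf][0]) == 1]

# 'closest': the foil leaf reachable through the fewest decision nodes
closest = min(foil_leaves, key=lambda leaf: distance(paths[fact_leaf], paths[leaf]))

# The explanation is the part of the foil path that the fact path does not share
k = shared_prefix(paths[fact_leaf], paths[closest])
rules = [f"{names[t.feature[n]]} {'<=' if left else '>'} {t.threshold[n]:.3f}"
         for n, left in paths[closest][k:]]
print("To reach the foil:", " and ".join(rules))
```

The other strategies would change only the `min` call, for example ranking foil leaves by sample count for 'size' or by impurity for 'impurity'.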

Domain Mappers

For handling the different types of data:

Tabular (rows and columns)
Images (rudimentary support)

A domain mapper maps the data to a general format in which the explanator can form the explanation, and then maps the explanation back into the original format. It also ensures meaningful feature names.

DomainMapper	Description
`DomainMapperTabular`	Tabular data (columns with feature names, rows)
`DomainMapperPandas`	Uses a `pandas` dataframe to create a `DomainMapperTabular`, while automatically inferring feature names
`DomainMapperImage`	Image data

License

ContrastiveExplanation is BSD-3 Licensed.

Comments

Changing output every run

Hi,

When I run exp.explain_instance_domain(model.predict_proba, sample), I get different output every time. After every run, a different column is given as output.

Is there a way I can get constant results?

opened by Subh1m 5
TypeError in the example notebook

running the example notebook (Contrastive explanation - example usage), the line exp.explain_instance_domain(model.predict_proba, sample) gives the following error:

opened by Naviden 3
Not working for multi-valued categorical features

Does the current implementation support only binary-valued categorical features?

Because I tried with the adult income dataset which has many multi-value categorical and continuous features (https://archive.ics.uci.edu/ml/datasets/adult) and got output like these:

"The model predicted '<=50k' instead of '>50k' because 'hours_per_week <= 42.832 and not occupation and age <= 34.95 and not education and hours_per_week <= 57.892'"

Here, education and occupation are not binary features - they have many levels.
bug enhancement

opened by raam93 3
Always getting the warning "UserWarning: Could not find a difference between fact...", with blank explanations - for any dataset, and every sample.

I am trying to exactly recreate the example from the README for the Iris dataset. Unfortunately, when running .explain_instance_domain(model.predict_proba, sample) I get the following output:

[F] Picked foil "1" using foil selection strategy "second"
[D] Obtaining neighborhood data
C:\Users\dsemkoandrosenko\contrastive_explanation\contrastive_explanation.py:264: UserWarning: Could not find a difference between fact "setosa" and foil "versicolor"
  warnings.warn(f'Could not find a difference between fact '
"The model predicted 'setosa' instead of 'versicolor' because ''"

I get the same issue with every single other sample, and even every other dataset I try. What could be the issue?

Versions: Windows 10 Python: 3.7.4 Scikit-Learn: 0.21.3 Numpy: 1.18.2
bug

opened by mlds2020 2
Create DOI on Zenodo

I wanted to ask whether you have considered creating a DOI for Contrastive Explanation. This allows researchers to reference a version of the package with ease and can be done with Zenodo, for example. There is also great integration between GitHub and Zenodo, which creates a new DOI and a persistent copy of the repository for each release. You can find instructions on how to create a DOI in the official GitHub docs: https://docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content This step helps with the visibility of this repository and therefore with getting your research software used more widely.

On a similar note, it also helps researchers to know how to correctly cite the software. I see that you already added this in the readme but Github also offers the citation file format which also shows up on the top right if used: You can find more information about it here: https://citation-file-format.github.io/

I would be happy to help with this if you have any questions.

opened by kequach 1
Trying to understand output

Hi

I have been trying your approach for a regression problem with categorical features. I receive an explanation in form:

The model predicted 123 because sales < 1445 and not month ( dummy example)

month is a categorical variable with values "1", "2", .. "12".

What does it mean " and not month" then ?

Thank you

opened by andreysharapov 0
Explanations of Clustering algorithms

Hi, I was wondering if the Clustering algorithm is supported or not. I see you mentioned it in the README but looking at the code, I can't find it anywhere. Thanks :)

opened by Naviden 0
Specify Features

Hi, thank you for the great package. I would like to know is there a way to specify which features to change? For example I would like to see what I need to change only for specific features?

Thank you
enhancement

opened by arsine1996 0

Releases(v0.2)

v0.2(Mar 4, 2022)

ContrastiveExplanation 0.2

Owner

M.J. Robeer

