Which Style Makes Me Attractive? Interpretable Control Discovery and Counterfactual Explanation on StyleGAN

Last update: Dec 01, 2022

Overview

Interpretable Control Exploration and Counterfactual Explanation (ICE) on StyleGAN

Bo Li, Qiulin Wang, Jiquan Pei, Yu Yang, Xiangyang Ji

Abstract: The semantically disentangled latent subspace of a GAN provides rich interpretable controls for image generation. This paper makes two contributions to semantic latent subspace analysis in the setting of face generation with StyleGAN2. First, we propose a novel approach to disentangle latent subspace semantics by exploiting existing face analysis models, e.g., face parsers and face landmark detectors. These models provide the flexibility to construct various criteria with concrete, interpretable semantic meanings (e.g., change face shape or change skin color) that constrain latent subspace disentanglement. Rich, previously unknown latent space controls can be discovered using the constructed criteria. Second, we propose a new perspective for explaining the behavior of a CNN classifier by generating counterfactuals in the interpretable latent subspaces we discovered. This explanation helps reveal whether the classifier learns semantics as intended. Experiments on various disentanglement criteria demonstrate the effectiveness of our approach. We believe this approach contributes to both image manipulation and the counterfactual explainability of CNNs.

The code builds on NVlabs/stylegan2-ada-pytorch and lives in the ice folder. Start with the two IPython notebooks described below.

ice/discover_subspaces

Solves for latent subspaces using face analysis models as criteria. Only a few representative subspaces are currently included. The notebook requires downloading several pre-trained models, and it may take some effort to place each file where the notebook expects it; see the notebook comments for details. The notebook sketches the code used to generate Figure 3 of the paper, i.e., the latent subspaces for interpretable face manipulation.
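To make the idea of a criterion-driven direction concrete, here is a minimal toy sketch. It is not the notebook's code and not necessarily the paper's solver: the two criterion functions are hypothetical stand-ins for "generator plus face analysis model", the latent has only three dimensions, and the direction is found by simple gradient projection. It uses plain Python so it runs without StyleGAN2 or any pre-trained model.

```python
import math


def target_criterion(w):
    # Hypothetical stand-in for a measurement we want to change,
    # e.g. a face-shape statistic computed from landmarks of G(w).
    return 2.0 * w[0] + 1.0 * w[1]


def preserved_criterion(w):
    # Hypothetical stand-in for a measurement we want to keep fixed,
    # e.g. mean skin color inside a face-parser mask of G(w).
    return 1.0 * w[1] + 0.5 * w[2]


def numerical_grad(f, w, eps=1e-5):
    # Central finite differences; a real pipeline would use autograd.
    grad = []
    for i in range(len(w)):
        plus, minus = list(w), list(w)
        plus[i] += eps
        minus[i] -= eps
        grad.append((f(plus) - f(minus)) / (2.0 * eps))
    return grad


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def disentangled_direction(target, preserved, w):
    # Remove from the target gradient its component along the preserved
    # gradient, so a small step changes the target but not the preserved value.
    g_t = numerical_grad(target, w)
    g_p = numerical_grad(preserved, w)
    scale = dot(g_t, g_p) / dot(g_p, g_p)
    d = [t - scale * p for t, p in zip(g_t, g_p)]
    norm = math.sqrt(dot(d, d))
    return [x / norm for x in d]


w0 = [0.1, -0.2, 0.3]
direction = disentangled_direction(target_criterion, preserved_criterion, w0)
w1 = [x + 0.5 * d for x, d in zip(w0, direction)]

print("target change:   ", target_criterion(w1) - target_criterion(w0))
print("preserved change:", preserved_criterion(w1) - preserved_criterion(w0))
```

With real models the criteria are nonlinear, so a direction found this way is only valid locally and several criteria usually have to be preserved at once.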

ice/explain_counterfactually

Uses the interpretable subspaces discovered by the notebook above to explain an attractiveness classifier. The notebook sketches the code used to generate Figure 4 of the paper, i.e., interpretable counterfactuals that increase the attractiveness score of a given classifier. Because we did not find a good public pre-trained model, we trained the attractiveness classifier ourselves using d-li14/face-attribute-prediction.
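The counterfactual idea can be illustrated with a similar toy sketch. Everything here is an assumption made for illustration: `classifier_score` is a hypothetical logistic stand-in for the attractiveness classifier applied to a generated image, and the named directions are hypothetical unit vectors rather than subspaces produced by the discovery notebook. The sketch restricts gradient ascent on the score to those directions and records how far it moved along each one, which is what makes the result readable as an explanation.

```python
import math


def classifier_score(w):
    # Hypothetical stand-in for classifier(G(w)); returns a score in (0, 1).
    z = 1.5 * w[0] - 0.5 * w[1] + 0.2 * w[2]
    return 1.0 / (1.0 + math.exp(-z))


def numerical_grad(f, w, eps=1e-5):
    # Central finite differences; a real pipeline would use autograd.
    grad = []
    for i in range(len(w)):
        plus, minus = list(w), list(w)
        plus[i] += eps
        minus[i] -= eps
        grad.append((f(plus) - f(minus)) / (2.0 * eps))
    return grad


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical interpretable directions (orthonormal here for simplicity).
directions = {
    "face_shape": [1.0, 0.0, 0.0],
    "skin_color": [0.0, 1.0, 0.0],
}


def counterfactual(score, w, directions, lr=0.5, steps=50):
    # Gradient ascent on the score, moving only along the given directions.
    w = list(w)
    coeffs = {name: 0.0 for name in directions}
    for _ in range(steps):
        g = numerical_grad(score, w)
        # Project the gradient onto each direction before taking the step.
        updates = {name: lr * dot(g, d) for name, d in directions.items()}
        for name, d in directions.items():
            coeffs[name] += updates[name]
            w = [x + updates[name] * y for x, y in zip(w, d)]
    return w, coeffs


w0 = [0.0, 0.0, 0.0]
w_cf, coeffs = counterfactual(classifier_score, w0, directions)

print("score before:", classifier_score(w0))
print("score after: ", classifier_score(w_cf))
print("movement per direction:", coeffs)
```

The per-direction coefficients are the explanation: a large coefficient on a direction suggests the classifier is sensitive to that semantic, and latent coordinates outside the chosen directions stay untouched.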
