Toward a Visual Concept Vocabulary for GAN Latent Space, ICCV 2021

Last update: Dec 23, 2022

Toward a Visual Concept Vocabulary for GAN Latent Space
Code and data from the ICCV 2021 paper

Sarah Schwettmann, Evan Hernandez, David Bau, Samuel Klein, Jacob Andreas, Antonio Torralba
Paper | Website | arXiv

This repository contains code for finding layer-selective directions, distilling them, and loading the vocabulary of visual concepts in BigGAN used in the original paper.

Notice: This repository is under active development! Expect instability until at least October 25th, 2021.

Installation

The provided code has been tested with Python 3.8 on macOS and Ubuntu 20.04. It may work in other environments, but we make no guarantees.

To run the code yourself, start by cloning the repository:

git clone https://github.com/schwettmann/visual-vocab
cd visual-vocab

(Optional) You will probably want to install the dependencies into a conda environment or virtual environment rather than globally. For example, to create and activate a new virtual environment, run:

python3 -m venv env
source env/bin/activate

Finally, install the Python dependencies using pip:

pip3 install -r requirements.txt

Usage

Notice: This section is under construction and will be updated as functionality gets added.

To download any of the annotated directions from the paper, use the datasets.load function. It downloads and parses the annotated directions. Example usage:

from visualvocab import datasets

# Download layer-selective directions and annotations used for distilling single-word directions:
dataset = datasets.load('lsd_all')

# Download distilled directions for all BigGAN-Places365 categories:
dataset = datasets.load('distilled_all')

# Download distilled directions for a specific BigGAN-Places365 category:
dataset = datasets.load('distilled_cottage')

See the datasets module for a full list of available annotated directions.
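If you need several categories at once, the calls above can be wrapped in a small helper. This is only a sketch: distilled_key and load_categories are hypothetical names that are not part of visualvocab, and the helper assumes every category follows the 'distilled_<category>' naming seen in 'distilled_cottage'. Only that one name is confirmed above, so check the datasets module before relying on the pattern. The loader is passed in as an argument, so the helper itself does not import visualvocab.

```python
from typing import Callable, Dict, Iterable


def distilled_key(category: str) -> str:
    """Build a dataset name of the form 'distilled_<category>'.

    Assumption: other categories follow the same pattern as the
    'distilled_cottage' example in this README.
    """
    return f"distilled_{category}"


def load_categories(
    load: Callable[[str], object],
    categories: Iterable[str],
) -> Dict[str, object]:
    """Call `load` once per category and return the results keyed by category."""
    return {category: load(distilled_key(category)) for category in categories}
```

With the package installed, this would be called as load_categories(datasets.load, ['cottage']), which returns a dict mapping 'cottage' to whatever datasets.load('distilled_cottage') returns.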

Citation

Sarah Schwettmann, Evan Hernandez, David Bau, Samuel Klein, Jacob Andreas, Antonio Torralba. Toward a Visual Concept Vocabulary for GAN Latent Space, Proceedings of the International Conference on Computer Vision (ICCV), 2021.

BibTeX

@InProceedings{Schwettmann_2021_ICCV,
    author    = {Schwettmann, Sarah and Hernandez, Evan and Bau, David and Klein, Samuel and Andreas, Jacob and Torralba, Antonio},
    title     = {Toward a Visual Concept Vocabulary for GAN Latent Space},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {6804-6812}
}
