Collapse by Conditioning: Training Class-conditional GANs with Limited Data

Overview

Collapse by Conditioning: Training Class-conditional GANs with Limited Data

Mohamad Shahbazi, Martin Danelljan, Danda P. Paudel, Luc Van Gool
Paper: https://openreview.net/forum?id=7TZeCsNOUB_

Teaser image

Abstract

Class-conditioning offers a direct means of controlling a Generative Adversarial Network (GAN) based on a discrete input variable. While necessary in many applications, the additional information provided by the class labels could even be expected to benefit the training of the GAN itself. Contrary to this belief, we observe that class-conditioning causes mode collapse in limited data settings, where unconditional learning leads to satisfactory generative ability. Motivated by this observation, we propose a training strategy for conditional GANs (cGANs) that effectively prevents the observed mode-collapse by leveraging unconditional learning. Our training strategy starts with an unconditional GAN and gradually injects conditional information into the generator and the objective function. The proposed method for training cGANs with limited data results not only in stable training but also in generating high-quality images, thanks to the early-stage exploitation of the shared information across classes. We analyze the aforementioned mode collapse problem in comprehensive experiments on four datasets. Our approach demonstrates outstanding results compared with state-of-the-art methods and established baselines.

Overview

  1. Requirements
  2. Getting Started
  3. Dataset Prepration
  4. Training
  5. Evaluation and Logging
  6. Contact
  7. How to Cite

Requirements

  • Linux and Windows are supported, but Linux is recommended for performance and compatibility reasons.
  • For the batch size of 64, we have used 4 NVIDIA GeForce RTX 2080 Ti GPUs (each having 11 GiB of memory).
  • 64-bit Python 3.7 and PyTorch 1.7.1. See https://pytorch.org/ for PyTorch installation instructions.
  • CUDA toolkit 11.0 or later. Use at least version 11.1 if running on RTX 3090. (Why is a separate CUDA toolkit installation required? See comments of this Github issue.)
  • Python libraries: pip install wandb click requests tqdm pyspng ninja imageio-ffmpeg==0.4.3.
  • This project uses Weights and Biases for visualization and logging. In addition to installing W&B (included in the command above), you need to create a free account on W&B website. Then, you must login to your account in the command line using the command ‍‍‍wandb login (The login information will be asked after running the command).
  • Docker users: use the provided Dockerfile by StyleGAN2+ADA (./Dockerfile) to build an image with the required library dependencies.

The code relies heavily on custom PyTorch extensions that are compiled on the fly using NVCC. On Windows, the compilation requires Microsoft Visual Studio. We recommend installing Visual Studio Community Edition and adding it into PATH using "C:\Program Files (x86)\Microsoft Visual Studio\ \Community\VC\Auxiliary\Build\vcvars64.bat" .

Getting Started

The code for this project is based on the Pytorch implementation of StyleGAN2+ADA. Please first read the instructions provided for StyleGAN2+ADA. Here, we mainly provide the additional details required to use our method.

For a quick start, we have provided example scripts in ./scripts, as well as an example dataset (a tar file containing a subset of ImageNet Carnivores dataset used in the paper) in ./datasets. Note that the scripts do not include the command for activating python environments. Moreover, the paths for the dataset and output directories can be modified in the scripts based on your own setup.

The following command runs a script that extracts the tar file and creates a ZIP file in the same directory.

bash scripts/prepare_dataset_ImageNetCarnivores_20_100.sh

The ZIP file is later used for training and evaluation. For more details on how to use your custom datasets, see Dataset Prepration.

Following command runs a script that trains the model using our method with default hyper-parameters:

bash scripts/train_ImageNetCarnivores_20_100.sh

For more details on how to use your custom datasets, see Training

To calculate the evaluation metrics on a pretrained model, use the following command:

bash scripts/inference_metrics_ImageNetCarnivores_20_100.sh

Outputs from the training and inferenve commands are by default placed under out/, controlled by --outdir. Downloaded network pickles are cached under $HOME/.cache/dnnlib, which can be overridden by setting the DNNLIB_CACHE_DIR environment variable. The default PyTorch extension build directory is $HOME/.cache/torch_extensions, which can be overridden by setting TORCH_EXTENSIONS_DIR.

Dataset Prepration

Datasets are stored as uncompressed ZIP archives containing uncompressed PNG files and a metadata file dataset.json for labels.

Custom datasets can be created from a folder containing images (each sub-directory containing images of one class in case of multi-class datasets) using dataset_tool.py; Here is an example of how to convert the dataset folder to the desired ZIP file:

python dataset_tool.py --source=datasets/ImageNet_Carnivores_20_100 --dest=datasets/ImageNet_Carnivores_20_100.zip --transform=center-crop --width=128 --height=128

The above example reads the images from the image folder provided by --src, resizes the images to the sizes provided by --width and --height, and applys the transform center-crop to them. The resulting images along with the metadata (label information) are stored as a ZIP file determined by --dest. see python dataset_tool.py --help for more information. See StyleGAN2+ADA instructions for more details on specific datasets or Legacy TFRecords datasets .

The created ZIP file can be passed to the training and evaluation code using --data argument.

Training

Training new networks can be done using train.py. In order to perform the training using our method, the argument --cond should be set to 1, so that the training is done conditionally. In addition, the start and the end of the transition from unconditional to conditional training should be specified using the arguments t_start_kimg and --t_end_kimg. Here is an example training command:

python train.py --outdir=./out/ \
--data=datasets/ImageNet_Carnivores_20_100.zip \
--cond=1 --t_start_kimg=2000  --t_end_kimg=4000  \
--gpus=4 \
--cfg=auto --mirror=1 \
--metrics=fid50k_full,kid50k_full

See StyleGAN2+ADA instructions for more details on the arguments, configurations amd hyper-parammeters. Please refer to python train.py --help for the full list of arguments.

Note: Our code currently can be used only for unconditional or transitional training. For the original conditional training, you can use the original implementation StyleGAN2+ADA.

Evaluation and Logging

By default, train.py automatically computes FID for each network pickle exported during training. More metrics can be added to the argument --metrics (as a comma-seperated list). To monitor the training, you can inspect the log.txt an JSON files (e.g. metric-fid50k_full.jsonl for FID) saved in the ouput directory. Alternatively, you can inspect WandB or Tensorboard logs (By default, WandB creates the logs under the project name "Transitional-cGAN", which can be accessed in your account on the website).

When desired, the automatic computation can be disabled with --metrics=none to speed up the training slightly (3%–9%). Additional metrics can also be computed after the training:

# Previous training run: look up options automatically, save result to JSONL file.
python calc_metrics.py --metrics=pr50k3_full \
    --network=~/training-runs/00000-ffhq10k-res64-auto1/network-snapshot-000000.pkl

# Pre-trained network pickle: specify dataset explicitly, print result to stdout.
python calc_metrics.py --metrics=fid50k_full --data=~/datasets/ffhq.zip --mirror=1 \
    --network=https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl

The first example looks up the training configuration and performs the same operation as if --metrics=pr50k3_full had been specified during training. The second example downloads a pre-trained network pickle, in which case the values of --mirror and --data must be specified explicitly.

See StyleGAN2+ADA instructions for more details on the available metrics.

Contact

For any questions, suggestions, or issues with the code, please contact Mohamad Shahbazi at [email protected]

How to Cite

@inproceedings{
shahbazi2022collapse,
title={Collapse by Conditioning: Training Class-conditional {GAN}s with Limited Data},
author={Shahbazi, Mohamad and Danelljan, Martin and Pani Paudel, Danda and Van Gool, Luc},
booktitle={The Tenth International Conference on Learning Representations },
year={2022},
url={https://openreview.net/forum?id=7TZeCsNOUB_}
Owner
Mohamad Shahbazi
Ph.D. student at Computer Vision Lab, ETH Zurich || Interested in Machine Learning and its Applications in Computer Vision, NLP and Healthcare
Mohamad Shahbazi
A multilingual version of MS MARCO passage ranking dataset

mMARCO A multilingual version of MS MARCO passage ranking dataset This repository presents a neural machine translation-based method for translating t

75 Dec 27, 2022
InterFaceGAN - Interpreting the Latent Space of GANs for Semantic Face Editing

InterFaceGAN - Interpreting the Latent Space of GANs for Semantic Face Editing Figure: High-quality facial attributes editing results with InterFaceGA

GenForce: May Generative Force Be with You 1.3k Jan 09, 2023
Pytorch implementation of One-Shot Affordance Detection

One-shot Affordance Detection PyTorch implementation of our one-shot affordance detection models. This repository contains PyTorch evaluation code, tr

46 Dec 12, 2022
Simple and Effective Few-Shot Named Entity Recognition with Structured Nearest Neighbor Learning

structshot Code and data for paper "Simple and Effective Few-Shot Named Entity Recognition with Structured Nearest Neighbor Learning", Yi Yang and Arz

ASAPP Research 47 Dec 27, 2022
Retinal vessel segmentation based on GT-UNet

Retinal vessel segmentation based on GT-UNet Introduction This project is a retinal blood vessel segmentation code based on UNet-like Group Transforme

Kent0n 27 Dec 18, 2022
Code base of object detection

rmdet code base of object detection. 环境安装: 1. 安装conda python环境 - `conda create -n xxx python=3.7/3.8` - `conda activate xxx` 2. 运行脚本,自动安装pytorch1

3 Mar 08, 2022
A Robust Non-IoU Alternative to Non-Maxima Suppression in Object Detection

Confluence: A Robust Non-IoU Alternative to Non-Maxima Suppression in Object Detection 1. 介绍 用以替代 NMS,在所有 bbox 中挑选出最优的集合。 NMS 仅考虑了 bbox 的得分,然后根据 IOU 来

44 Sep 15, 2022
Implementation of " SESS: Self-Ensembling Semi-Supervised 3D Object Detection" (CVPR2020 Oral)

SESS: Self-Ensembling Semi-Supervised 3D Object Detection Created by Na Zhao from National University of Singapore Introduction This repository contai

125 Dec 23, 2022
Cleaned up code for DSTC 10: SIMMC 2.0 track: subtask 2: multimodal coreference resolution

UNITER-Based Situated Coreference Resolution with Rich Multimodal Input: arXiv MMCoref_cleaned Code for the MMCoref task of the SIMMC 2.0 dataset. Pre

Yichen (William) Huang 2 Dec 05, 2022
Implementation of Barlow Twins paper

barlowtwins PyTorch Implementation of Barlow Twins paper: Barlow Twins: Self-Supervised Learning via Redundancy Reduction This is currently a work in

IgorSusmelj 86 Dec 20, 2022
[ECCV'20] Convolutional Occupancy Networks

Convolutional Occupancy Networks Paper | Supplementary | Video | Teaser Video | Project Page | Blog Post This repository contains the implementation o

622 Dec 30, 2022
The code for the CVPR 2021 paper Neural Deformation Graphs, a novel approach for globally-consistent deformation tracking and 3D reconstruction of non-rigid objects.

Neural Deformation Graphs Project Page | Paper | Video Neural Deformation Graphs for Globally-consistent Non-rigid Reconstruction Aljaž Božič, Pablo P

Aljaz Bozic 134 Dec 16, 2022
(AAAI2020)Grapy-ML: Graph Pyramid Mutual Learning for Cross-dataset Human Parsing

Grapy-ML: Graph Pyramid Mutual Learning for Cross-dataset Human Parsing This repository contains pytorch source code for AAAI2020 oral paper: Grapy-ML

54 Aug 04, 2022
Predicting Semantic Map Representations from Images with Pyramid Occupancy Networks

This is the code associated with the paper Predicting Semantic Map Representations from Images with Pyramid Occupancy Networks, published at CVPR 2020.

Thomas Roddick 219 Dec 20, 2022
Tilted Empirical Risk Minimization (ICLR '21)

Tilted Empirical Risk Minimization This repository contains the implementation for the paper Tilted Empirical Risk Minimization ICLR 2021 Empirical ri

Tian Li 40 Nov 28, 2022
✨✨✨An awesome open source toolbox for stereo matching.

OpenStereo This is an awesome open source toolbox for stereo matching. Supported Methods: BM SGM(T-PAMI'07) GCNet(ICCV'17) PSMNet(CVPR'18) StereoNet(E

Wang Qingyu 6 Nov 04, 2022
Large scale PTM - PPI relation extraction

Large-scale protein-protein post-translational modification extraction with distant supervision and confidence calibrated BioBERT The silver standard

1 Feb 25, 2022
[MedIA2021]MIDeepSeg: Minimally Interactive Segmentation of Unseen Objects from Medical Images Using Deep Learning

MIDeepSeg: Minimally Interactive Segmentation of Unseen Objects from Medical Images Using Deep Learning [MedIA or Arxiv] and [Demo] This repository pr

Healthcare Intelligence Laboratory 92 Dec 08, 2022
Zalo AI challenge 2021 task hum to song

Zalo AI challenge 2021 task Hum to Song pipeline: Chuẩn bị dữ liệu cho quá trình train: Sửa các file đường dẫn trong config/preprocess.yaml raw_path:

Vo Van Phuc 105 Dec 16, 2022
Pairwise model for commonlit competition

Pairwise model for commonlit competition To run: - install requirements - create input directory with train_folds.csv and other competition data - cd

abhishek thakur 45 Aug 31, 2022