[NeurIPS'21] Projected GANs Converge Faster

Overview

[Project] [PDF] [Supplementary] [Talk]

This repository contains the code for our NeurIPS 2021 paper "Projected GANs Converge Faster"

by Axel Sauer, Kashyap Chitta, Jens MΓΌller, and Andreas Geiger.

If you find our code or paper useful, please cite

@InProceedings{Sauer2021NEURIPS,
  author         = {Axel Sauer and Kashyap Chitta and Jens M{\"{u}}ller and Andreas Geiger},
  title          = {Projected GANs Converge Faster},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  year           = {2021},
}

ToDos

  • Initial code release
  • Providing pretrained models
  • Easy-to-use colab
  • StyleGAN3 support

Requirements

  • 64-bit Python 3.8 and PyTorch 1.9.0 (or later). See https://pytorch.org for PyTorch install instructions.
  • Use the following commands with Miniconda3 to create and activate your PG Python environment:
    • conda env create -f environment.yml
    • conda activate pg
  • The StyleGAN2 generator relies on custom CUDA kernels, which are compiled on the fly. Hence you need:
    • CUDA toolkit 11.1 or later.
    • GCC 7 or later compilers. Recommended GCC version depends on CUDA version, see for example CUDA 11.4 system requirements.
    • If you run into problems when setting up for the custom CUDA kernels, we refer to the Troubleshooting docs of the original StyleGAN repo. When using the FastGAN generator you will not need the custom kernels.

Data Preparation

For a quick start, you can download the few-shot datasets provided by the authors of FastGAN. You can download them here. To prepare the dataset at the respective resolution, run for example

python dataset_tool.py --source=./data/pokemon --dest=./data/pokemon256.zip \
  --resolution=256x256 --transform=center-crop

You can get the datasets we used in our paper at their respective websites:

CLEVR, FFHQ, Cityscapes, LSUN, AFHQ, Landscape.

Training

Training your own PG on LSUN church using 8 GPUs:

python train.py --outdir=./training-runs/ --cfg=fastgan --data=./data/pokemon256.zip \
  --gpus=8 --batch=64 --mirror=1 --snap=50 --batch-gpu=8 --kimg=10000

--batch specifies the overall batch size, --batch-gpu specifies the batch size per GPU. If you use fewer GPUs, the training loop will automatically accumulate gradients, until the overall batch size is reached.

If you want to use the StyleGAN2 generator, use --cfg=stylegan2. Samples and metrics are saved in outdir. To monitor the training progress, you can inspect fid50k_full.json or run tensorboard in training-runs.

Generating Samples & Interpolations

To generate samples and interpolation videos, run

python gen_images.py --outdir=out --trunc=1.0 --seeds=10-15 \
  --network=PATH_TO_NETWORK_PKL

and

python gen_video.py --output=lerp.mp4 --trunc=1.0 --seeds=0-31 --grid=4x2 \
  --network=PATH_TO_NETWORK_PKL

Quality Metrics

Per default, train.py tracks FID50k during training. To calculate metrics for a specific network snapshot, run

python calc_metrics.py --metrics=fid50k_full --network=PATH_TO_NETWORK_PKL

To see the available metrics, run

python calc_metrics.py --help

Using PG in your own project

Our implementation is modular, so it is straightforward to use PG in your own codebase. Simply copy the pg_modules folder to your project. Then, to get the projected multi-scale discriminator, run

from pg_modules.discriminator import ProjectedDiscriminator
D = ProjectedDiscriminator()

The only thing you still need to do is to make sure that the feature network is not trained, i.e., explicitly set

D.feature_network.requires_grad_(False)

in your training loop.

Acknowledgments

Our codebase build and extends the awesome StyleGAN2-ADA repo and StyleGAN3 repo, both by Karras et al.

Furthermore, we use parts of the code of FastGAN and MiDas.

Cl datasets - PyTorch image dataloaders and utility functions to load datasets for supervised continual learning

Continual learning datasets Introduction This repository contains PyTorch image

berjaoui 5 Aug 28, 2022
Covid-19 Test AI (Deep Learning - NNs) Software. Accuracy is the %96.5, loss is the 0.09 :)

Covid-19 Test AI (Deep Learning - NNs) Software I developed a segmentation algorithm to understand whether Covid-19 Test Photos are positive or negati

Emirhan BULUT 28 Dec 04, 2021
Serving PyTorch 1.0 Models as a Web Server in C++

Serving PyTorch Models in C++ This repository contains various examples to perform inference using PyTorch C++ API. Run git clone https://github.com/W

Onur Kaplan 223 Jan 04, 2023
Contains code for the paper "Vision Transformers are Robust Learners".

Vision Transformers are Robust Learners This repository contains the code for the paper Vision Transformers are Robust Learners by Sayak Paul* and Pin

Sayak Paul 103 Jan 05, 2023
Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

Antoine Caillon 589 Jan 02, 2023
RGB-stacking πŸ›‘ 🟩 πŸ”· for robotic manipulation

RGB-stacking πŸ›‘ 🟩 πŸ”· for robotic manipulation BLOG | PAPER | VIDEO Beyond Pick-and-Place: Tackling Robotic Stacking of Diverse Shapes, Alex X. Lee*,

DeepMind 95 Dec 23, 2022
Official Implementation of "LUNAR: Unifying Local Outlier Detection Methods via Graph Neural Networks"

LUNAR Official Implementation of "LUNAR: Unifying Local Outlier Detection Methods via Graph Neural Networks" Adam Goodge, Bryan Hooi, Ng See Kiong and

Adam Goodge 25 Dec 28, 2022
ManipNet: Neural Manipulation Synthesis with a Hand-Object Spatial Representation - SIGGRAPH 2021

ManipNet: Neural Manipulation Synthesis with a Hand-Object Spatial Representation - SIGGRAPH 2021 Dataset Code Demos Authors: He Zhang, Yuting Ye, Tak

HE ZHANG 194 Dec 06, 2022
Implementation of CVPR'21: RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction

RfD-Net [Project Page] [Paper] [Video] RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction Yinyu Nie, Ji Hou, Xiaoguang Han, Matthi

Yinyu Nie 162 Jan 06, 2023
Code used to generate the results appearing in "Train longer, generalize better: closing the generalization gap in large batch training of neural networks"

Train longer, generalize better - Big batch training This is a code repository used to generate the results appearing in "Train longer, generalize bet

Elad Hoffer 145 Sep 16, 2022
A state-of-the-art semi-supervised method for image recognition

Mean teachers are better role models Paper ---- NIPS 2017 poster ---- NIPS 2017 spotlight slides ---- Blog post By Antti Tarvainen, Harri Valpola (The

Curious AI 1.4k Jan 06, 2023
GPU-accelerated Image Processing library using OpenCL

pyclesperanto pyclesperanto is a python package for clEsperanto - a multi-language framework for GPU-accelerated image processing. clEsperanto uses Op

17 Dec 25, 2022
Solving SMPL/MANO parameters from keypoint coordinates.

Minimal-IK A simple and naive inverse kinematics solver for MANO hand model, SMPL body model, and SMPL-H body+hand model. Briefly, given joint coordin

Yuxiao Zhou 305 Dec 30, 2022
Behind the Curtain: Learning Occluded Shapes for 3D Object Detection

Behind the Curtain: Learning Occluded Shapes for 3D Object Detection Acknowledgement We implement our model, BtcDet, based on [OpenPcdet 0.3.0]. Insta

Qiangeng Xu 163 Dec 19, 2022
An implementation of the methods presented in Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data.

An implementation of the methods presented in Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data.

Andrew Jesson 9 Apr 04, 2022
A Simple Example for Imitation Learning with Dataset Aggregation (DAGGER) on Torcs Env

Imitation Learning with Dataset Aggregation (DAGGER) on Torcs Env This repository implements a simple algorithm for imitation learning: DAGGER. In thi

Hao 66 Nov 23, 2022
Towards Improving Embedding Based Models of Social Network Alignment via Pseudo Anchors

PSML paper: Towards Improving Embedding Based Models of Social Network Alignment via Pseudo Anchors PSML_IONE,PSML_ABNE,PSML_DEEPLINK,PSML_SNNA: numpy

13 Nov 27, 2022
Final report with code for KAIST Course KSE 801.

Orthogonal collocation is a method for the numerical solution of partial differential equations

Chuanbo HUA 4 Apr 06, 2022
Sound and Cost-effective Fuzzing of Stripped Binaries by Incremental and Stochastic Rewriting

StochFuzz: A New Solution for Binary-only Fuzzing StochFuzz is a (probabilistically) sound and cost-effective fuzzing technique for stripped binaries.

Zhuo Zhang 164 Dec 05, 2022
Official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers

Visual Parser (ViP) This is the official implementation of the paper Visual Parser: Representing Part-whole Hierarchies with Transformers. Key Feature

Shuyang Sun 117 Dec 11, 2022