Spectral normalization (SN) is a widely-used technique for improving the stability and sample quality of Generative Adversarial Networks (GANs)

Overview

Why Spectral Normalization Stabilizes GANs: Analysis and Improvements

[paper (NeurIPS 2021)] [paper (arXiv)] [code]

Authors: Zinan Lin, Vyas Sekar, Giulia Fanti

Abstract: Spectral normalization (SN) is a widely-used technique for improving the stability and sample quality of Generative Adversarial Networks (GANs). However, there is currently limited understanding of why SN is effective. In this work, we show that SN controls two important failure modes of GAN training: exploding and vanishing gradients. Our proofs illustrate a (perhaps unintentional) connection with the successful LeCun initialization. This connection helps to explain why the most popular implementation of SN for GANs requires no hyper-parameter tuning, whereas stricter implementations of SN have poor empirical performance out-of-the-box. Unlike LeCun initialization which only controls gradient vanishing at the beginning of training, SN preserves this property throughout training. Building on this theoretical understanding, we propose a new spectral normalization technique: Bidirectional Scaled Spectral Normalization (BSSN), which incorporates insights from later improvements to LeCun initialization: Xavier initialization and Kaiming initialization. Theoretically, we show that BSSN gives better gradient control than SN. Empirically, we demonstrate that it outperforms SN in sample quality and training stability on several benchmark datasets.


This repo contains the codes for reproducing the experiments of our BSN and different SN variants in the paper. The codes were tested under Python 2.7.5, TensorFlow 1.14.0.

Preparing datasets

CIFAR10

Download cifar-10-python.tar.gz from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz (or from other sources).

STL10

Download stl10_binary.tar.gz from http://ai.stanford.edu/~acoates/stl10/stl10_binary.tar.gz (or from other sources), and put it in dataset_preprocess/STL10 folder. Then run python preprocess.py. This code will resize the images into 48x48x3 format, and save the images in stl10.npy.

CelebA

Download img_align_celeba.zip from https://www.kaggle.com/jessicali9530/celeba-dataset (or from other sources), and put it in dataset_preprocess/CelebA folder. Then run python preprocess.py. This code will crop and resize the images into 64x64x3 format, and save the images in celeba.npy.

ImageNet

Download ILSVRC2012_img_train.tar from http://www.image-net.org/ (or from other sources), and put it in dataset_preprocess/ImageNet folder. Then run python preprocess.py. This code will crop and resize the images into 128x128x3 format, and save the images in ILSVRC2012folder. Each subfolder in ILSVRC2012 folder corresponds to one class. Each npy file in the subfolders corresponds to an image.

Training BSN and SN variants

Prerequisites

The codes are based on GPUTaskScheduler library, which helps you automatically schedule the jobs among GPU nodes. Please install it first. You may need to change GPU configurations according to the devices you have. The configurations are set in config.py in each directory. Please refer to GPUTaskScheduler's GitHub page for the details of how to make proper configurations.

You can also run these codes without GPUTaskScheduler. Just run python gan.py in gan subfolders.

CIFAR10, STL10, CelebA

Preparation

Copy the preprocessed datasets from the previous steps into the following paths:

  • CIFAR10: /data/CIFAR10/cifar-10-python.tar.gz.
  • STL10: /data/STL10/cifar-10-stl10.npy.
  • CelebA: /data/CelebA/celeba.npy.

Here means

  • Vanilla SN and our proposed BSSN/SSN/BSN without gammas: no_gamma-CNN.
  • SN with the same gammas: same_gamma-CNN.
  • SN with different gammas: diff_gamma-CNN.

Alternatively, you can directly modify the dataset paths in /gan_task.py to the path of the preprocessed dataset folders.

Running codes

Now you can directly run python main.py in each to train the models.

All the configurable hyper-parameters can be set in config.py. The hyper-parameters in the file are already set for reproducing the results in the paper. Please refer to GPUTaskScheduler's GitHub page for the details of the grammar of this file.

ImageNet

Preparation

Copy the preprocessed folder ILSVRC2012 from the previous steps to /data/imagenet/ILSVRC2012, where means

  • Vanilla SN and our proposed BSSN/SSN/BSN without gammas: no_gamma-ResNet.

Alternatively, you can directly modify the dataset path in /gan_task.py to the path of the preprocessed folder ILSVRC2012.

Running codes

Now you can directly run python main.py in each to train the models.

All the configurable hyper-parameters can be set in config.py. The hyper-parameters in the file are already set for reproducing the results in the paper. Please refer to GPUTaskScheduler's GitHub page for the details of the grammar of this file.

The code supports multi-GPU training for speed-up, by separating each data batch equally among multiple GPUs. To do that, you only need to make minor modifications in config.py. For example, if you have two GPUs with IDs 0 and 1, then all you need to do is to (1) change "gpu": ["0"] to "gpu": [["0", "1"]], and (2) change "num_gpus": [1] to "num_gpus": [2]. Note that the number of GPUs might influence the results because in this implementation the batch normalization layers on different GPUs are independent. In our experiments, we were using only one GPU.

Results

The code generates the following result files/folders:

  • /results/ /worker.log : Standard output and error from the code.
  • /results/ /metrics.csv : Inception Score and FID during training.
  • /results/ /sample/*.png : Generated images during training.
  • /results/ /checkpoint/* : TensorFlow checkpoints.
  • /results/ /time.txt : Training iteration timestamps.
Owner
Zinan Lin
Ph.D. student at Electrical and Computer Engineering, Carnegie Mellon University
Zinan Lin
Unified file system operation experience for different backend

megfile - Megvii FILE library Docs: http://megvii-research.github.io/megfile megfile provides a silky operation experience with different backends (cu

MEGVII Research 76 Dec 14, 2022
Western-3DSlicer-Modules - Point-Set Registrations for Ultrasound Probe Calibrations

Point-Set Registrations for Ultrasound Probe Calibrations -Undergraduate Thesis-

Matteo Tanzi 0 May 04, 2022
Learning Off-Policy with Online Planning, CoRL 2021

LOOP: Learning Off-Policy with Online Planning Accepted in Conference of Robot Learning (CoRL) 2021. Harshit Sikchi, Wenxuan Zhou, David Held Paper In

Harshit Sikchi 24 Nov 22, 2022
Curvlearn, a Tensorflow based non-Euclidean deep learning framework.

English | 简体中文 Why Non-Euclidean Geometry Considering these simple graph structures shown below. Nodes with same color has 2-hop distance whereas 1-ho

Alibaba 123 Dec 12, 2022
This project deploys a yolo fastest model in the form of tflite on raspberry 3b+. The model is from another repository of mine called -Trash-Classification-Car

Deploy-yolo-fastest-tflite-on-raspberry 觉得有用的话可以顺手点个star嗷 这个项目将垃圾分类小车中的tflite模型移植到了树莓派3b+上面。 该项目主要是为了记录在树莓派部署yolo fastest tflite的流程 (之后有时间会尝试用C++部署来提升

7 Aug 16, 2022
[Preprint] "Chasing Sparsity in Vision Transformers: An End-to-End Exploration" by Tianlong Chen, Yu Cheng, Zhe Gan, Lu Yuan, Lei Zhang, Zhangyang Wang

Chasing Sparsity in Vision Transformers: An End-to-End Exploration Codes for [Preprint] Chasing Sparsity in Vision Transformers: An End-to-End Explora

VITA 64 Dec 08, 2022
A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

TorchRL Disclaimer This library is not officially released yet and is subject to change. The features are available before an official release so that

Meta Research 860 Jan 07, 2023
Deep Multimodal Neural Architecture Search

MMNas: Deep Multimodal Neural Architecture Search This repository corresponds to the PyTorch implementation of the MMnas for visual question answering

Vision and Language Group@ MIL 23 Dec 21, 2022
RSC-Net: 3D Human Pose, Shape and Texture from Low-Resolution Images and Videos

RSC-Net: 3D Human Pose, Shape and Texture from Low-Resolution Images and Videos Implementation for "3D Human Pose, Shape and Texture from Low-Resoluti

XiangyuXu 42 Nov 10, 2022
SalGAN: Visual Saliency Prediction with Generative Adversarial Networks

SalGAN: Visual Saliency Prediction with Adversarial Networks Junting Pan Cristian Canton Ferrer Kevin McGuinness Noel O'Connor Jordi Torres Elisa Sayr

Image Processing Group - BarcelonaTECH - UPC 347 Nov 22, 2022
Official implementation of "GS-WGAN: A Gradient-Sanitized Approach for Learning Differentially Private Generators" (NeurIPS 2020)

GS-WGAN This repository contains the implementation for GS-WGAN: A Gradient-Sanitized Approach for Learning Differentially Private Generators (NeurIPS

46 Nov 09, 2022
Code of the paper "Deep Human Dynamics Prior" in ACM MM 2021.

Code of the paper "Deep Human Dynamics Prior" in ACM MM 2021. Figure 1: In the process of motion capture (mocap), some joints or even the whole human

Shinny cui 3 Oct 31, 2022
A Game-Theoretic Perspective on Risk-Sensitive Reinforcement Learning

Officile code repository for "A Game-Theoretic Perspective on Risk-Sensitive Reinforcement Learning"

Mathieu Godbout 1 Nov 19, 2021
Scientific Computation Methods in C and Python (Open for Hacktoberfest 2021)

Sci - cpy README is a stub. Do expand it. Objective This repository is meant to be a ready reference for scientific computation methods. Do ⭐ it if yo

Sandip Dutta 7 Oct 12, 2022
Spatial Transformer Nets in TensorFlow/ TensorLayer

MOVED TO HERE Spatial Transformer Networks Spatial Transformer Networks (STN) is a dynamic mechanism that produces transformations of input images (or

Hao 36 Nov 23, 2022
Minimalist Error collection Service compatible with Rollbar clients. Sentry or Rollbar alternative.

Minimalist Error collection Service Features Compatible with any Rollbar client(see https://docs.rollbar.com/docs). Just change the endpoint URL to yo

Haukur Rósinkranz 381 Nov 11, 2022
Codecov coverage standard for Python

Python-Standard Last Updated: 01/07/22 00:09:25 What is this? This is a Python application, with basic unit tests, for which coverage is uploaded to C

Codecov 10 Nov 04, 2022
The official implementation of EIGNN: Efficient Infinite-Depth Graph Neural Networks (NeurIPS 2021)

EIGNN: Efficient Infinite-Depth Graph Neural Networks The official implementation of EIGNN: Efficient Infinite-Depth Graph Neural Networks (NeurIPS 20

Juncheng Liu 14 Nov 22, 2022
A PyTorch-based library for semi-supervised learning

News If you want to join TorchSSL team, please e-mail Yidong Wang ([email protected]<

1k Jan 06, 2023
Learning Multiresolution Matrix Factorization and its Wavelet Networks on Graphs

Project Learning Multiresolution Matrix Factorization and its Wavelet Networks on Graphs, https://arxiv.org/pdf/2111.01940.pdf. Authors Truong Son Hy

5 Jun 28, 2022