
Overview


StudioGAN is a PyTorch library providing implementations of representative Generative Adversarial Networks (GANs) for conditional/unconditional image generation. StudioGAN aims to offer an identical playground for modern GANs so that machine learning researchers can readily compare and analyze new ideas.

Features

  • Extensive GAN implementations for PyTorch
  • Comprehensive benchmark of GANs using CIFAR10, Tiny ImageNet, and ImageNet datasets
  • Better performance and lower memory consumption than original implementations
  • Providing pre-trained models that are fully compatible with up-to-date PyTorch environments
  • Support for Multi-GPU (DP, DDP, and multi-node DistributedDataParallel), Mixed Precision, Synchronized Batch Normalization, LARS, TensorBoard visualization, and other analysis methods

Implemented GANs

Name Venue Architecture G_type* D_type* Loss EMA**
DCGAN arXiv' 15 CNN/ResNet*** N/A N/A Vanilla False
LSGAN ICCV' 17 CNN/ResNet*** N/A N/A Least Square False
GGAN arXiv' 17 CNN/ResNet*** N/A N/A Hinge False
WGAN-WC ICML' 17 ResNet N/A N/A Wasserstein False
WGAN-GP NIPS' 17 ResNet N/A N/A Wasserstein False
WGAN-DRA arXiv' 17 ResNet N/A N/A Wasserstein False
ACGAN ICML' 17 ResNet cBN AC Hinge False
ProjGAN ICLR' 18 ResNet cBN PD Hinge False
SNGAN ICLR' 18 ResNet cBN PD Hinge False
SAGAN ICML' 19 ResNet cBN PD Hinge False
BigGAN ICLR' 19 Big ResNet cBN PD Hinge True
BigGAN-Deep ICLR' 19 Big ResNet Deep cBN PD Hinge True
CRGAN ICLR' 20 Big ResNet cBN PD/CL Hinge True
ICRGAN arXiv' 20 Big ResNet cBN PD/CL Hinge True
LOGAN arXiv' 19 Big ResNet cBN PD Hinge True
DiffAugGAN NeurIPS' 20 Big ResNet cBN PD/CL Hinge True
ADAGAN NeurIPS' 20 Big ResNet cBN PD/CL Hinge True
ContraGAN NeurIPS' 20 Big ResNet cBN CL Hinge True
FreezeD CVPRW' 20 - - - - -

*G/D_type indicates how label information is injected into the Generator or Discriminator. **EMA means applying an exponential moving average update to the generator. ***Experiments on Tiny ImageNet are conducted using the ResNet architecture instead of CNN.
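
For reference, the EMA update itself is a one-line blend of parameters; a minimal sketch (assuming a deep-copied g_ema model, not StudioGAN's exact code):

import torch

@torch.no_grad()
def ema_update(g_ema, g, decay=0.9999):
    # Blend each EMA parameter toward the live generator's parameters;
    # the EMA copy, not the raw generator, is typically used at evaluation time.
    for p_ema, p in zip(g_ema.parameters(), g.parameters()):
        p_ema.mul_(decay).add_(p, alpha=1.0 - decay)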

cBN : conditional Batch Normalization. AC : Auxiliary Classifier. PD : Projection Discriminator. CL : Contrastive Learning.
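
As an illustration of the cBN conditioning above, a minimal conditional Batch Normalization module looks roughly like this (a sketch, not StudioGAN's exact implementation):

import torch.nn as nn

class ConditionalBatchNorm2d(nn.Module):
    # Normalize activations without affine parameters, then apply a
    # per-class gain and bias looked up from embedding tables.
    def __init__(self, num_features, num_classes):
        super().__init__()
        self.bn = nn.BatchNorm2d(num_features, affine=False)
        self.gain = nn.Embedding(num_classes, num_features)
        self.bias = nn.Embedding(num_classes, num_features)
        nn.init.ones_(self.gain.weight)
        nn.init.zeros_(self.bias.weight)

    def forward(self, x, y):
        out = self.bn(x)
        gain = self.gain(y).view(-1, out.size(1), 1, 1)
        bias = self.bias(y).view(-1, out.size(1), 1, 1)
        return gain * out + bias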

To be Implemented

Name Venue Architecture G_type* D_type* Loss EMA**
StyleGAN2 CVPR' 20 StyleNet AdaIN - Vanilla True

AdaIN : Adaptive Instance Normalization.
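
For reference, AdaIN replaces batch statistics with per-sample instance statistics modulated by a style code; a rough sketch (illustrative only, with gain/bias of shape (N, C, 1, 1) assumed to come from a style network):

import torch

def adaptive_instance_norm(x, gain, bias, eps=1e-5):
    # Normalize each (sample, channel) over the spatial dimensions,
    # then modulate with the style-derived gain and bias.
    mu = x.mean(dim=(2, 3), keepdim=True)
    sigma = x.std(dim=(2, 3), keepdim=True)
    return gain * (x - mu) / (sigma + eps) + bias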

Requirements

  • Anaconda
  • Python >= 3.6
  • 6.0.0 <= Pillow <= 7.0.0
  • scipy == 1.1.0 (Recommended for fast loading of Inception Network)
  • sklearn
  • seaborn
  • h5py
  • tqdm
  • torch >= 1.6.0 (Recommended for mixed precision training and knn analysis)
  • torchvision >= 0.7.0
  • tensorboard
  • 5.4.0 <= gcc <= 7.4.0 (Recommended for proper use of adaptive discriminator augmentation module)
  • torchlars (required for the LARS optimizer; install with "pip install torchlars")

You can install the recommended environment as follows:

conda env create -f environment.yml -n studiogan

With docker, you can use:

docker pull mgkang/studiogan:latest

The following command creates a container named "studioGAN" and maps port 6006 so that you can connect to TensorBoard:

docker run -it --gpus all --shm-size 128g -p 6006:6006 --name studioGAN -v /home/USER:/root/code --workdir /root/code mgkang/studiogan:latest /bin/bash

Quick Start

  • Train (-t) and evaluate (-e) the model defined in CONFIG_PATH using GPU 0
CUDA_VISIBLE_DEVICES=0 python3 src/main.py -t -e -c CONFIG_PATH
  • Train (-t) and evaluate (-e) the model defined in CONFIG_PATH using GPUs (0, 1, 2, 3) and DataParallel
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 src/main.py -t -e -c CONFIG_PATH

Try python3 src/main.py to see available options.

Via Tensorboard, you can monitor trends of IS, FID, F_beta, Authenticity Accuracies, and the largest singular values:

~ PyTorch-StudioGAN/logs/RUN_NAME>>> tensorboard --logdir=./ --port PORT

Dataset

  • CIFAR10: StudioGAN will automatically download the dataset once you execute main.py.

  • Tiny ImageNet, ImageNet, or a custom dataset:

    1. download Tiny ImageNet and ImageNet, or prepare your own dataset.
    2. make the folder structure of the dataset as follows:
┌── docs
├── src
└── data
    └── ILSVRC2012 or TINY_ILSVRC2012 or CUSTOM
        ├── train
        │   ├── cls0
        │   │   ├── train0.png
        │   │   ├── train1.png
        │   │   └── ...
        │   ├── cls1
        │   └── ...
        └── valid
            ├── cls0
            │   ├── valid0.png
            │   ├── valid1.png
            │   └── ...
            ├── cls1
            └── ...
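
The layout above is the standard torchvision.datasets.ImageFolder format, so you can sanity-check a dataset before training (the paths below are illustrative):

from torchvision import datasets, transforms

train_set = datasets.ImageFolder("./data/CUSTOM/train", transform=transforms.ToTensor())
valid_set = datasets.ImageFolder("./data/CUSTOM/valid", transform=transforms.ToTensor())
print(len(train_set), train_set.classes[:5])  # image count and first few class names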

Supported Training Techniques

  • DistributedDataParallel (Please refer to Here)
    ### NODE_0, 4_GPUs, All ports are open to NODE_1
    docker run -it --gpus all --shm-size 128g --name studioGAN --network=host -v /home/USER:/root/code --workdir /root/code mgkang/studiogan:latest /bin/bash
    
    ~/code>>> export NCCL_SOCKET_IFNAME=^docker0,lo
    ~/code>>> export MASTER_ADDR=PUBLIC_IP_OF_NODE_0
    ~/code>>> export MASTER_PORT=AVAILABLE_PORT_OF_NODE_0
    
    ~/code/PyTorch-StudioGAN>>> CUDA_VISIBLE_DEVICES=0,1,2,3 python3 src/main.py -t -e -DDP -n 2 -nr 0 -c CONFIG_PATH
    ### NODE_1, 4_GPUs, All ports are open to NODE_0
    docker run -it --gpus all --shm-size 128g --name studioGAN --network=host -v /home/USER:/root/code --workdir /root/code mgkang/studiogan:latest /bin/bash
    
    ~/code>>> export NCCL_SOCKET_IFNAME=^docker0,lo
    ~/code>>> export MASTER_ADDR=PUBLIC_IP_OF_NODE_0
    ~/code>>> export MASTER_PORT=AVAILABLE_PORT_OF_NODE_0
    
    ~/code/PyTorch-StudioGAN>>> CUDA_VISIBLE_DEVICES=0,1,2,3 python3 src/main.py -t -e -DDP -n 2 -nr 1 -c CONFIG_PATH

※ StudioGAN does not support DDP training for ContraGAN, because contrastive learning requires a 'gather' operation to calculate the exact conditional contrastive loss.
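
As a hedged illustration (not StudioGAN's code), the gather step such a loss would need looks like the sketch below; note that dist.all_gather does not propagate gradients through the copies received from other ranks, which is part of what makes the exact conditional contrastive loss hard to compute under DDP:

import torch
import torch.distributed as dist

def gather_embeddings(local_emb):
    # Collect the embeddings computed on every rank so that each process
    # can build the full similarity matrix for the contrastive loss.
    world_size = dist.get_world_size()
    gathered = [torch.zeros_like(local_emb) for _ in range(world_size)]
    dist.all_gather(gathered, local_emb)
    gathered[dist.get_rank()] = local_emb  # keep the autograd-connected local slice
    return torch.cat(gathered, dim=0)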

  • Mixed Precision Training (Narang et al.)
    CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -t -mpc -c CONFIG_PATH
  • Standing Statistics (Brock et al.; see the sketch after this list)
    CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -e -std_stat --standing_step STANDING_STEP -c CONFIG_PATH
  • Synchronized BatchNorm
    CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -t -sync_bn -c CONFIG_PATH
  • Load All Data in Main Memory
    CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -t -l -c CONFIG_PATH
  • LARS
    CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -t -l -c CONFIG_PATH -LARS
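
The Standing Statistics option above roughly amounts to re-estimating the generator's BN statistics over many forward passes before evaluation (Brock et al.). A hedged sketch, assuming a generator that takes (z, y) and uses standard BatchNorm2d layers:

import torch

@torch.no_grad()
def accumulate_standing_stats(generator, z_dim, num_classes, standing_step, batch_size, device):
    generator.train()  # BN layers must be updating their running statistics
    for m in generator.modules():
        if isinstance(m, torch.nn.BatchNorm2d):
            m.reset_running_stats()
            m.momentum = None  # None -> cumulative moving average over all batches
    for _ in range(standing_step):
        z = torch.randn(batch_size, z_dim, device=device)
        y = torch.randint(num_classes, (batch_size,), device=device)
        generator(z, y)  # forward passes only, to accumulate statistics
    generator.eval()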

To Visualize and Analyze Generated Images

StudioGAN supports image visualization, K-nearest neighbor analysis, linear interpolation, frequency analysis, and TSNE analysis. All results are saved in ./figures/RUN_NAME/*.png.

  • Image Visualization
CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -iv -std_stat --standing_step STANDING_STEP -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH

  • K-Nearest Neighbor Analysis (K is fixed to 7; the images in the first column are generated images)
CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -knn -std_stat --standing_step STANDING_STEP -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH

  • Linear Interpolation (applicable only to conditional Big ResNet models)
CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -itp -std_stat --standing_step STANDING_STEP -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH

  • Frequency Analysis
CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -fa -std_stat --standing_step STANDING_STEP -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH

  • TSNE Analysis
CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -tsne -std_stat --standing_step STANDING_STEP -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH

Metrics

Inception Score (IS)

Inception Score (IS) is a metric that measures how well a GAN generates high-fidelity and diverse images. Calculating IS requires the pre-trained Inception-V3 network, and recent approaches utilize OpenAI's TensorFlow implementation.
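
Conceptually, IS = exp(E_x[KL(p(y|x) || p(y))]) over Inception-V3 predictions. A compact sketch of the computation (assuming logits is an N x 1000 tensor of Inception outputs for N generated images):

import torch.nn.functional as F

def inception_score(logits):
    probs = F.softmax(logits, dim=1)            # p(y|x) for each generated image
    marginal = probs.mean(dim=0, keepdim=True)  # p(y), averaged over the sample
    kl = (probs * (probs.log() - marginal.log())).sum(dim=1)
    return kl.mean().exp().item()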

To compute official IS, you have to make a "samples.npz" file using the command below:

CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -s -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --log_output_path LOG_OUTPUT_PATH

It will automatically create the samples.npz file in the path ./samples/RUN_NAME/fake/npz/samples.npz. After that, execute the official TensorFlow IS implementation. Note that we do not split the dataset into ten folds to calculate IS ten times. We use the entire dataset to compute IS only once, which is the evaluation strategy used in the CompareGAN repository.

CUDA_VISIBLE_DEVICES=0,...,N python3 src/inception_tf13.py --run_name RUN_NAME --type "fake"

Keep in mind that you need TensorFlow 1.3 or an earlier version installed!

Note that StudioGAN logs a PyTorch-based IS during training.

Frechet Inception Distance (FID)

FID is a widely used metric to evaluate the performance of a GAN model. Calculating FID requires the pre-trained Inception-V3 network, and modern approaches use the TensorFlow-based FID. StudioGAN utilizes the PyTorch-based FID to test GAN models in the same PyTorch environment. We show that the PyTorch-based FID implementation provides almost the same results as the TensorFlow implementation (see Appendix F of our paper).
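
For reference, FID is the Frechet distance between Gaussians fitted to real and generated Inception features; a sketch of the core formula (mu/sigma are the feature means and covariances, assumed to be computed elsewhere):

import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    # FID = ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 * sqrtm(S1 @ S2))
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if np.iscomplexobj(covmean):  # drop tiny imaginary parts from numerics
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))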

Precision and Recall (PR)

Precision measures how accurately the generator can learn the target distribution. Recall measures how completely the generator covers the target distribution. Like IS and FID, calculating Precision and Recall requires the pre-trained Inception-V3 model. StudioGAN uses the same hyperparameter settings as the original Precision and Recall implementation, and calculates the F-beta score suggested by Sajjadi et al.
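
The F_1/8 and F_8 columns in the benchmark tables below follow the usual F_beta definition, where beta > 1 emphasizes recall (coverage) and beta < 1 emphasizes precision (fidelity):

def f_beta(precision, recall, beta):
    # F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)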

Benchmark

※ We always welcome your contribution if you find any wrong implementation, bug, or misreported score.

We report the best IS, FID, and F_beta values of various GANs. B. S. means batch size for training.

CR, ICR, DiffAug, ADA, and LO refer to regularization or optimization techniques: CR (Consistency Regularization), ICR (Improved Consistency Regularization), DiffAug (Differentiable Augmentation), ADA (Adaptive Discriminator Augmentation), and LO (Latent Optimization), respectively.
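
As an example of these techniques, consistency regularization penalizes the discriminator for changing its output under a semantics-preserving augmentation T; a minimal sketch (illustrative only; augment and the lambda value are placeholders, not StudioGAN's exact settings):

import torch

def consistency_loss(discriminator, real_images, augment, lambda_cr=10.0):
    d_real = discriminator(real_images)          # D's output on clean images
    d_aug = discriminator(augment(real_images))  # D's output on augmented images
    return lambda_cr * ((d_real - d_aug) ** 2).mean()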

CIFAR10 (3x32x32)

When training, we used the command below.

CUDA_VISIBLE_DEVICES=0 python3 src/main.py -t -e -l -stat_otf -c CONFIG_PATH --eval_type "test"

With a single TITAN RTX GPU, training BigGAN takes about 13-15 hours.

Name B. S. IS(⭡) FID(⭣) F_1/8(⭡) F_8(⭡) Config Log Weights
DCGAN 64 6.638 49.030 0.833 0.795 Config Log Link
LSGAN 64 5.577 66.686 0.757 0.720 Config Log Link
GGAN 64 6.227 42.714 0.916 0.822 Config Log Link
WGAN-WC 64 2.579 159.090 0.190 0.199 Config Log Link
WGAN-GP 64 7.458 25.852 0.962 0.929 Config Log Link
WGAN-DRA 64 6.432 41.586 0.922 0.863 Config Log Link
ACGAN 64 6.629 45.571 0.857 0.847 Config Log Link
ProjGAN 64 7.539 33.830 0.952 0.855 Config Log Link
SNGAN 64 8.677 13.248 0.983 0.978 Config Log Link
SAGAN 64 8.680 14.009 0.982 0.970 Config Log Link
BigGAN 64 9.746 8.034 0.995 0.994 Config Log Link
BigGAN + CR 64 10.380 7.178 0.994 0.993 Config Log Link
BigGAN + ICR 64 10.153 7.430 0.994 0.993 Config Log Link
BigGAN + DiffAug 64 9.775 7.157 0.996 0.993 Config Log Link
BigGAN + ADA 64 10.136 7.881 0.993 0.994 Config Log Link
BigGAN + LO 64 9.701 8.369 0.992 0.989 Config Log Link
ContraGAN 64 9.729 8.065 0.993 0.992 Config Log Link
ContraGAN + CR 64 9.812 7.685 0.995 0.993 Config Log Link
ContraGAN + ICR 64 10.117 7.547 0.996 0.993 Config Log Link
ContraGAN + DiffAug 64 9.996 7.193 0.995 0.990 Config Log Link
ContraGAN + ADA 64 9.411 10.830 0.990 0.964 Config Log Link

※ IS, FID, and F_beta values are computed using 10K test and 10K generated images.

※ When evaluating, the statistics of batch normalization layers are calculated on the fly (statistics of a batch).

CUDA_VISIBLE_DEVICES=0 python3 src/main.py -e -l -stat_otf -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --eval_type "test"

Tiny ImageNet (3x64x64)

When training, we used the command below.

CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -t -e -l -stat_otf -c CONFIG_PATH --eval_type "valid"

With 4 TITAN RTX GPUs, training BigGAN takes about 2 days.

Name B. S. IS(⭡) FID(⭣) F_1/8(⭡) F_8(⭡) Config Log Weights
DCGAN 256 5.640 91.625 0.606 0.391 Config Log Link
LSGAN 256 5.381 90.008 0.638 0.390 Config Log Link
GGAN 256 5.146 102.094 0.503 0.307 Config Log Link
WGAN-WC 256 9.696 41.454 0.940 0.735 Config Log Link
WGAN-GP 256 1.322 311.805 0.016 0.000 Config Log Link
WGAN-DRA 256 9.564 40.655 0.938 0.724 Config Log Link
ACGAN 256 6.342 78.513 0.668 0.518 Config Log Link
ProjGAN 256 6.224 89.175 0.626 0.428 Config Log Link
SNGAN 256 8.412 53.590 0.900 0.703 Config Log Link
SAGAN 256 8.342 51.414 0.898 0.698 Config Log Link
BigGAN 1024 11.998 31.920 0.956 0.879 Config Log Link
BigGAN + CR 1024 14.887 21.488 0.969 0.936 Config Log Link
BigGAN + ICR 1024 5.605 91.326 0.525 0.399 Config Log Link
BigGAN + DiffAug 1024 17.075 16.338 0.979 0.971 Config Log Link
BigGAN + ADA 1024 15.158 24.121 0.953 0.942 Config Log Link
BigGAN + LO 256 6.964 70.660 0.857 0.621 Config Log Link
ContraGAN 1024 13.494 27.027 0.975 0.902 Config Log Link
ContraGAN + CR 1024 15.623 19.716 0.983 0.941 Config Log Link
ContraGAN + ICR 1024 15.830 21.940 0.980 0.944 Config Log Link
ContraGAN + DiffAug 1024 17.303 15.755 0.984 0.962 Config Log Link
ContraGAN + ADA 1024 8.398 55.025 0.878 0.677 Config Log Link

※ IS, FID, and F_beta values are computed using 50K validation and 50K generated images.

※ When evaluating, the statistics of batch normalization layers are calculated on the fly (statistics of a batch).

CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -e -l -stat_otf -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --eval_type "valid"

ImageNet (3x128x128)

When training, we used the command below.

CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -t -e -l -sync_bn -stat_otf -c CONFIG_PATH --eval_type "valid"

With 8 TESLA V100 GPUs, training BigGAN2048 takes about a month.

Name B. S. IS(⭡) FID(⭣) F_1/8(⭡) F_8(⭡) Config Log Weights
SNGAN 256 32.247 26.792 0.938 0.913 Config Log Link
SAGAN 256 29.848 34.726 0.849 0.914 Config Log Link
BigGAN 256 28.633 24.684 0.941 0.921 Config Log Link
BigGAN 2048 99.705 7.893 0.985 0.989 Config Log Link
ContraGAN 256 25.249 25.161 0.947 0.855 Config Log Link

※ IS, FID, and F_beta values are computed using 50K validation and 50K generated images.

※ When evaluating, the statistics of batch normalization layers are calculated in advance (moving average of the previous statistics).

CUDA_VISIBLE_DEVICES=0,...,N python3 src/main.py -e -l -sync_bn -c CONFIG_PATH --checkpoint_folder CHECKPOINT_FOLDER --eval_type "valid"

References

[1] Exponential Moving Average: https://github.com/ajbrock/BigGAN-PyTorch

[2] Synchronized BatchNorm: https://github.com/vacancy/Synchronized-BatchNorm-PyTorch

[3] Self-Attention module: https://github.com/voletiv/self-attention-GAN-pytorch

[4] Implementation Details: https://github.com/ajbrock/BigGAN-PyTorch

[5] Architecture Details: https://github.com/google/compare_gan

[6] DiffAugment: https://github.com/mit-han-lab/data-efficient-gans

[7] Adaptive Discriminator Augmentation: https://github.com/rosinality/stylegan2-pytorch

[8] Tensorflow IS: https://github.com/openai/improved-gan

[9] Tensorflow FID: https://github.com/bioinf-jku/TTUR

[10] Pytorch FID: https://github.com/mseitzer/pytorch-fid

[11] Tensorflow Precision and Recall: https://github.com/msmsajjadi/precision-recall-distributions

[12] torchlars: https://github.com/kakaobrain/torchlars

Citation

StudioGAN was established for the following research project. Please cite our work if you use StudioGAN.

@inproceedings{kang2020ContraGAN,
  title   = {{ContraGAN: Contrastive Learning for Conditional Image Generation}},
  author  = {Minguk Kang and Jaesik Park},
  booktitle = {Conference on Neural Information Processing Systems (NeurIPS)},
  year    = {2020}
}
Comments
  • Training stops after step 4000

    Training always stops after this step:

    Visualize (num_rows x 8) fake image canvans.
    Save image canvas to output/figures/CUSTOM-ReACGAN-train-2022_03_23_14_40_18/generated_canvas_4000.png
    Start Evaluation (4000 Step): CUSTOM-ReACGAN-train-2022_03_23_14_40_18
    generate images and stack features (36808 images).

    I'm training ReACGAN with my own dataset. However, the training always stops after this step, and there are no logs explaining why!

    Any help or idea would be highly appreciated!

    opened by festinais 17
  • Confused about results

    Hi, thank you for sharing this good repo of GAN reimplementations.

    I ran the command python main.py --eval -t -c "./configs/Table2/biggan32_cifar_hinge_no.json" and the result is

    model=G-step=80000-Inception_mean=9.465-Inception_std=0.00551-FID=10.344.pth

    but when I use ContraGAN with the same dataset:

    python main.py --eval -t -c "./configs/Table2/contragan32_cifar_hinge_no.json"

    the result is:

    Current best model (FID) is ..... step=78000-Inception_mean=8.89-Inception_std=0.1667-FID=12.648

    Why is ContraGAN not better than BigGAN?

    opened by Johnson-yue 14
  • A circular import in utils/misc.py

    I encountered this problem when running the following command

    python ./src/main.py -t -metrics is fid prdc -cfg ./src/configs/CIFAR10/LGAN.yaml -data ./data/ -save ./result/

    Traceback (most recent call last):
      File "/root/gan/PyTorch-StudioGAN-master/src/main.py", line 19, in <module>
        import config
      File "/root/gan/PyTorch-StudioGAN-master/src/config.py", line 16, in <module>
        import utils.misc as misc
      File "/root/gan/PyTorch-StudioGAN-master/src/utils/misc.py", line 33, in <module>
        import utils.sample as sample
      File "/root/gan/PyTorch-StudioGAN-master/src/utils/sample.py", line 21, in <module>
        import utils.losses as losses
      File "/root/gan/PyTorch-StudioGAN-master/src/utils/losses.py", line 36, in <module>
        import utils.misc as misc
    AttributeError: module 'utils' has no attribute 'misc'

    opened by vis-opt-group 12
  • Truncation trick

    Dear author,

    I am reading your implementation of latent sampling in sample.py (function: sample_latents). Gaussian sampling is implemented as latents = torch.randn(batch_size, dim, device=device)/truncated_factor.

    I notice that the above implementation is not the standard truncation trick, which is defined as:

    The Truncation Trick is a latent sampling procedure for generative adversarial networks, where we sample from a truncated normal (where values which fall outside a range are resampled to fall inside that range)
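
    For reference, a standard truncated-normal sampler (illustrative; not the repository's code) would resample out-of-range values, e.g. via scipy:

    import torch
    from scipy.stats import truncnorm

    def truncated_latents(batch_size, dim, threshold=2.0):
        # Sample from N(0, 1) restricted to [-threshold, threshold].
        samples = truncnorm.rvs(-threshold, threshold, size=(batch_size, dim))
        return torch.from_numpy(samples).float()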

    opened by lihuiknight 11
  • ContraGAN no improvement at all

    I am not getting any improvement from ContraGAN at all. For the main contribution, the FID improvement is not even 0.1. If I rerun the FID computation, I often get a worse score; the same goes for IS. Other metrics are not improving at all, as your tables show. Including other methods like DiffAug is not fair, since BigGAN can benefit from those methods too.

    The FID score can change by a lot more than 0.1, sometimes by more than 1 point, if you run it many times. How many times did you train the model or compute the FID score?

    For ImageNet 128x128, the ContraGAN score is even worse than the BigGAN baseline. Since ContraGAN is built on BigGAN, the result is no improvement or worse performance.

    I want to use your work, but it is hard to believe the scores now. Can you help explain? Thanks.

    opened by curiousbyte19 11
  • Question on DistributedDataParallel (DDP)

    Hi, I am testing the DDP code with 4 V100 GPUs as below,

    export MASTER_ADDR="localhost"
    export MASTER_PORT=2222
    CUDA_VISIBLE_DEVICES=0,1,2,3 python3 src/main.py -t -metrics none -cfg CONFIG_PATH -data DATA_PATH -save SAVE_PATH -DDP -sync_bn -mpc

    but I'm getting less than 48% GPU utilization. Could I get some help to improve this problem?

    Below is what I trained with a single RTX 3090 Ti, for comparison.

    opened by jakeyahn 10
  • What should std_max and std_step be set to reproduce the results?

    Hi,

    Thank you for your great work in the GAN field. I would like to know what std_max and std_step should be set to in order to reproduce the results of the ImageNet experiment in the ReACGAN paper.

    opened by liang-hou 8
  • Plan to add StyleGAN2 baseline for experiments

    Hi Minguk,

    Any plans for implementing the StyleGAN2 generator architecture in the library? Are there any issues that make it hard to implement in this framework?

    Thanks for the great library.

    opened by rangwani-harsh 7
  • GPU util is not 100%

    Hi, thank you for a great repository!

    I'm trying to train SNGAN on my custom dataset (128x128) on 4 GPUs, but I found that GPU utilization is around 90%. I also increased num_workers to 16 and applied -rm_API, but the issue persists. Do you have any suspicion about what causes this?

    By the way, even BigGAN_256 raised an out-of-memory (OOM) error on 4 GPUs with 12G of memory each. Do I need an 8-GPU server or distributed training for BigGAN_256?

    Thank you for your help!

    p.s. typo: load_frameowrk -> load_framework

    opened by sangwoomo 7
  • How can I get the same result as you?

    Hi,

    I ran ContraGAN on CIFAR10 using the config file you provided, without any modification. However, I got a worse result, with an FID of 10.435, while your result is 8.065. I wonder what is different between our setups. In addition, you get a much better result than the figure reported in the paper (10.597±0.273); how did you do that?

    If anyone faces the same issue, please tell me. Thank you!

    opened by Linwei-Tao 6
  • Reproducing BigGAN256 results

    Hi Minguk,

    Recently, I wanted to reproduce the results of ProjGAN on ImageNet with the config file src/configs/ILSVRC2012/BigGAN256.json. I used the command

    python src/main.py -t -e -sync_bn -c src/configs/ILSVRC2012/BigGAN256.json --eval_type "valid"
    

    From the logging file logs/IMAGENET/BigGAN256-train-2021_01_24_03_52_15.log, the FID at iteration 44000 is 46.32; however, I get 43.87 when I run it. The difference is not negligible. I trained BigGAN256 on 4 A100 GPUs. I am wondering: is this normal behavior, or is there anything I potentially did wrong?

    The only differences between my experiment setting and the command in the README file are:

    1. I use CenterCrop during training, but I think that would only make the FID score worse.
    2. I did not use the -l option, to save loading time.
    3. The -stat_otf flag is also off, since it contradicts sync_bn.
    4. I wrapped the valid folder in another one, since ILSVRC2012's valid folder is not organized into subfolders and causes problems with ImageFolder.

    Thanks a lot in advance!

    opened by phymhan 6
  • Fix CAS computation

    • missing optimizer.zero_grad()
    • validation accuracy was computed on the training set instead of the validation set
    • fix log

    Also, I believe it would be better to generate a fake dataset only once and use it to train the classifier for N epochs. So the correct pipeline would be:

    1. Generate a fake dataset of the same length (and, preferably, the same number of images per class) as the original training dataset (e.g., CIFAR10 -> 5000 images per class)
    2. Shuffle the dataset
    3. Train the classifier for N epochs and, at the end of each epoch, compute the accuracy on the original validation dataset
    opened by sup3rgiu 0
  • Gradient penalty "interpolates" term

    Hi,

    This is the cal_grad_penalty function in src/utils/losses.py:

    def cal_grad_penalty(real_images, real_labels, fake_images, discriminator, device):
        batch_size, c, h, w = real_images.shape
        alpha = torch.rand(batch_size, 1)
        alpha = alpha.expand(batch_size, real_images.nelement() // batch_size).contiguous().view(batch_size, c, h, w)
        alpha = alpha.to(device)
    
        real_images = real_images.to(device)
        interpolates = alpha * real_images + ((1 - alpha) * fake_images)
        interpolates = interpolates.to(device)
        interpolates = autograd.Variable(interpolates, requires_grad=True)
        fake_dict = discriminator(interpolates, real_labels, eval=False)
        grads = cal_deriv(inputs=interpolates, outputs=fake_dict["adv_output"], device=device)
        grads = grads.view(grads.size(0), -1)
    
        grad_penalty = ((grads.norm(2, dim=1) - 1)**2).mean() + interpolates[:,0,0,0].mean()*0
        return grad_penalty
    

    In the last line, grad_penalty = ((grads.norm(2, dim=1) - 1)**2).mean() + interpolates[:,0,0,0].mean()*0, I wanted to know what the additive term + interpolates[:,0,0,0].mean()*0 means. Since it is multiplied by zero, I think it actually has no effect in the code.

    I'll be waiting for your answer

    Thank you!

    opened by rl-max 0
  • G/D Losses go NaN (Not A Number) in WGAN-GP training

    Hi,

    I discovered that, in WGAN-GP training on CIFAR10, the Discriminator and Generator losses always become NaN (Not A Number) after some time has elapsed. The problem occurred when I ran the src/configs/CIFAR10/WGAN-GP.yaml file, but I did not check whether this also happens with other training data.

    +) One thing I find interesting is that this always happens (I think) at exactly 1500 steps after the start of training.

    ==================== This is the code I ran =======================

    !python src/main.py -t -hdf5 -l -metrics is fid prdc -ref test --num_workers 2 --save_freq 5000 \
        -cfg src/configs/CIFAR10/WGAN-GP.yaml \
        -data ./cifar10 -save . -mpc --post_resizer friendly --eval_backbone InceptionV3_tf
    

    ==================== This is the log I got ====================

     [INFO] 2022-11-02 23:50:13 > Step:    900 Progress: 0.9% Elapsed: 0:07:59 Gen_loss: -2.982 Dis_loss: -1.749 
     [INFO] 2022-11-02 23:51:06 > Step:   1000 Progress: 1.0% Elapsed: 0:08:52 Gen_loss: 12.56 Dis_loss: -1.571 
     [INFO] 2022-11-02 23:52:00 > Step:   1100 Progress: 1.1% Elapsed: 0:09:46 Gen_loss: 6.941 Dis_loss: -1.659 
     [INFO] 2022-11-02 23:52:53 > Step:   1200 Progress: 1.2% Elapsed: 0:10:38 Gen_loss: 6.988 Dis_loss: -3.02 
     [INFO] 2022-11-02 23:53:46 > Step:   1300 Progress: 1.3% Elapsed: 0:11:32 Gen_loss: 9.656 Dis_loss: -1.676 
     [INFO] 2022-11-02 23:54:38 > Step:   1400 Progress: 1.4% Elapsed: 0:12:24 Gen_loss: 4.777 Dis_loss: -1.606 
     [INFO] 2022-11-02 23:55:29 > Step:   1500 Progress: 1.5% Elapsed: 0:13:14 Gen_loss: nan Dis_loss: nan 
     [INFO] 2022-11-02 23:56:17 > Step:   1600 Progress: 1.6% Elapsed: 0:14:03 Gen_loss: nan Dis_loss: nan 
     [INFO] 2022-11-02 23:57:05 > Step:   1700 Progress: 1.7% Elapsed: 0:14:51 Gen_loss: nan Dis_loss: nan 
     [INFO] 2022-11-02 23:57:54 > Step:   1800 Progress: 1.8% Elapsed: 0:15:39 Gen_loss: nan Dis_loss: nan 
     [INFO] 2022-11-02 23:58:42 > Step:   1900 Progress: 1.9% Elapsed: 0:16:28 Gen_loss: nan Dis_loss: nan 
     [INFO] 2022-11-02 23:59:31 > Step:   2000 Progress: 2.0% Elapsed: 0:17:17 Gen_loss: nan Dis_loss: nan
    
    opened by rl-max 0
  • Training on custom data does not balance out

    Hello,

    Thanks for this repo and all of the implementations. I have been trying to get the implementation to work on some custom data; however, the loss curves always indicate that something is going wrong. The data I am training on is 256 x 256, and when training on it the discriminator keeps going to values very close to 0 and eventually either goes to zero or comes back for a few training steps, and then the same procedure repeats.

    Do you have any tips or suggestions on which parameters to change so that training runs more stably with a custom dataset?

    Thanks

    opened by mehdiosa 1
  • list index out of range (in utils.ckpt.load_StudioGAN_ckpts())

    hi,

    I found that glob.glob (in utils.ckpt.load_StudioGAN_ckpts()) returns an empty list, causing a list index out of range error to appear.

    The reason is that the folder name contains the special character '=', so you need to use glob.escape() to match the files in the path normally.

    I used

    x = join(ckpt_dir, "model=G-{when}-weights-step=".format(when=when))
    Gen_ckpt_path = glob.glob(glob.escape(x) + '*.pth')[0]
        
    y = join(ckpt_dir, "model=D-{when}-weights-step=".format(when=when))
    Dis_ckpt_path = glob.glob(glob.escape(y) + '*.pth')[0]
    

    instead of

    Gen_ckpt_path = glob.glob(join(ckpt_dir, "model=G-{when}-weights-step*.pth".format(when=when)))[0]
    
    Dis_ckpt_path = glob.glob(join(ckpt_dir, "model=D-{when}-weights-step*.pth".format(when=when)))[0]
    
    opened by goongzi-leean 1
Releases (v.0.4.0)
  • v.0.4.0 (Jul 5, 2022)

    • We checked the reproducibility of implemented GANs.
    • We provide Baby, Papa, and Grandpa ImageNet datasets, where images are processed using an anti-aliasing, high-quality resizer.
    • StudioGAN provides a dedicated benchmark on standard datasets (CIFAR10, ImageNet, AFHQv2, and FFHQ).
    • StudioGAN supports InceptionV3, ResNet50, SwAV, DINO, and Swin Transformer backbones for GAN evaluation.
  • v.0.3.0 (Nov 5, 2021)

    • Add SOTA GANs: LGAN, TACGAN, StyleGAN2, MDGAN, MHGAN, ADCGAN, ReACGAN (our new paper).
    • Add five types of differentiable augmentation: CR, DiffAugment, ADA, SimCLR, BYOL.
    • Implement useful regularizations: Top-K training, Feature Matching, R1-Regularization, MaxGP
    • Add Improved Precision & Recall, Density & Coverage, iFID, and CAS for reliable evaluation.
    • Support Inception_V3 and SwAV backbones for GAN evaluation.
    • Verify the reproducibility of StyleGAN2 and BigGAN.
    • Fix bugs in FreezeD, DDP training, Mixed Precision training, and ADA.
    • Support Discriminator-Driven Latent Sampling and Semantic Factorization for BigGAN evaluation.
    • Support Wandb logging instead of Tensorboard.
  • v0.2.0 (Feb 23, 2021)

    Second release of StudioGAN with the following features:

    • Fix minor bugs (slow convergence when training GAN + ADA models, tracking BN statistics during evaluation, etc.)
    • Add multi-node DistributedDataParallel (DDP) training.
    • Comprehensive benchmarks on CIFAR10, Tiny_ImageNet, and ImageNet datasets.
    • Provide pre-trained models and log files for future research.
    • Add LARS optimizer and TSNE analysis.
  • v0.1.0 (Dec 7, 2020)

    First StudioGAN release with the following features:

    • Extensive GAN implementations for PyTorch: from DCGAN to ADAGAN
    • Comprehensive benchmark of GANs using the CIFAR10 dataset
    • Better performance and lower memory consumption than original implementations
    • Providing pre-trained models that are fully compatible with up-to-date PyTorch environments
    • Support for Multi-GPU (both DP and DDP), Mixed Precision, Synchronized Batch Normalization, and TensorBoard visualization