This is a re-implementation of TransGAN: Two Pure Transformers Can Make One Strong GAN (CVPR 2021) in PyTorch.

Last update: Jan 05, 2023

Related tags

Overview

TransGAN: Two Transformers Can Make One Strong GAN [YouTube Video]

Paper Authors: Yifan Jiang, Shiyu Chang, Zhangyang Wang

CVPR 2021

This is re-implementation of TransGAN: Two Transformers Can Make One Strong GAN, and That Can Scale Up, CVPR 2021 in PyTorch.

Generative Adversarial Networks-GAN builded completely free of Convolutions and used Transformers architectures which became popular since Vision Transformers-ViT. In this implementation, CIFAR-10 dataset was used.

0 Epoch	40 Epoch	100 Epoch	200 Epoch

Related Work - Vision Transformers (ViT)

In this implementation, as a discriminator, Vision Transformer(ViT) Block was used. In order to get more info about ViT, you can look at the original paper here

Credits for illustration of ViT: @lucidrains

Installation

Before running train.py, check whether you have libraries in requirements.txt! Also, create ./fid_stat folder and download the fid_stats_cifar10_train.npz file in this folder. To save your model during training, create ./checkpoint folder using mkdir checkpoint.

Training

python train.py

Pretrained Model

You can find pretrained model here. You can download using:

wget https://drive.google.com/file/d/134GJRMxXFEaZA0dF-aPpDS84YjjeXPdE/view

curl gdrive.sh | bash -s https://drive.google.com/file/d/134GJRMxXFEaZA0dF-aPpDS84YjjeXPdE/view

License

MIT

Citation

@article{jiang2021transgan,
  title={TransGAN: Two Transformers Can Make One Strong GAN},
  author={Jiang, Yifan and Chang, Shiyu and Wang, Zhangyang},
  journal={arXiv preprint arXiv:2102.07074},
  year={2021}
}

@article{dosovitskiy2020,
  title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
  author={Dosovitskiy, Alexey and Beyer, Lucas and Kolesnikov, Alexander and Weissenborn, Dirk and Zhai, Xiaohua and Unterthiner, Thomas and  Dehghani, Mostafa and Minderer, Matthias and Heigold, Georg and Gelly, Sylvain and Uszkoreit, Jakob and Houlsby, Neil},
  journal={arXiv preprint arXiv:2010.11929},
  year={2020}
}

@inproceedings{zhao2020diffaugment,
  title={Differentiable Augmentation for Data-Efficient GAN Training},
  author={Zhao, Shengyu and Liu, Zhijian and Lin, Ji and Zhu, Jun-Yan and Han, Song},
  booktitle={Conference on Neural Information Processing Systems (NeurIPS)},
  year={2020}
}

Comments

GPU memory, Modifying batch size
Hello,

I saw your comment in VITA-Group's implementation of TransGAN and started looking at your implementation here.

Without modifying anything and attempting to run "python train.py" results in CUDA out of memory; I believe the GPU I'm using cannot handle the model size/training images that you've specified. I tried editing the batch size on lines 35 and 36 of train.py (--gener_batch_size, changing default from 64 to 32, etc.), but I get a RuntimeError of:

Output 0 of UnbindBackward is a view and is being modified inplace. This view is the output of a function that returns multiple views. Such fuctions do not allow the otutput views to be modified inplace. You should replace the inplace operation by an out-of-place one.

My two questions are:

How would you suggest modifying the training parameters to deal with GPU running out of memory? and,

Is there a better way to edit the batch size, and what else do I need to change in order for the code to not break when the batch size is changed?

Thanks!
opened by Andrew-X-Wang 10
Create your own FID stats file

Hello and thanks for the implementation. I'm trying to train this model on a different datset, but to do so I need a custom fid_stats file for my dataset. How can I create it ?

opened by IlyasMoutawwakil 2
FID score: nan

Thank you for your contribution. But in the training processing, FID score is Nan. I want to known whether it is appropriate. Should I make some chance to solve this problem?

opened by Jamie-Cheung 1
TransGAN fid problem

hello,I would like to humbly ask you what is the difference beetween TransGAN-main and TransGAN-master?can Trans-main reproduce similar results of the original paper? The results obtained by using CIFAR in TransGAN-main are quite different from those in the paper,and WGAN-EP loss concussion,so I want to ask you.

opened by Stephenlove 1
How do you test on your own dataset with the checkpoint.pth generated?

I want to use the checkpoint saved to generate my own results from a testing dataset and use those images later to calculate my own evaluation metrics. Please help

opened by meh-naz 0

Releases(v2.0)

v2.0(Jul 6, 2021)

More qualified generated images with TransGAN on CIFAR10 dataset.
Source code(tar.gz)
Source code(zip)
v1.0(May 31, 2021)

In this version of re-implementation, MNIST and CIFAR-10 datasets were used for TransGAN-S.
Source code(tar.gz)
Source code(zip)

Owner

Ahmet Sarigun

Yet, another human being!

GitHub Repository https://arxiv.org/abs/2102.07074

Build fully-functioning computer vision models with PyTorch

Detecto is a Python package that allows you to build fully-functioning computer vision and object detection models with just 5 lines of code. Inferenc

576 Dec 29, 2022

MPI-IS Mesh Processing Library

Perceiving Systems Mesh Package This package contains core functions for manipulating meshes and visualizing them. It requires Python 3.5+ and is supp

494 Jan 06, 2023

Nvdiffrast - Modular Primitives for High-Performance Differentiable Rendering

Nvdiffrast – Modular Primitives for High-Performance Differentiable Rendering Modular Primitives for High-Performance Differentiable Rendering Samuli

675 Jan 06, 2023

Tutorials and implementations for "Self-normalizing networks"

Self-Normalizing Networks Tutorials and implementations for "Self-normalizing networks"(SNNs) as suggested by Klambauer et al. (arXiv pre-print). Vers

1.6k Jan 07, 2023

Emulation and Feedback Fuzzing of Firmware with Memory Sanitization

BaseSAFE This repository contains the BaseSAFE Rust APIs, introduced by "BaseSAFE: Baseband SAnitized Fuzzing through Emulation". The example/ directo

138 Dec 16, 2022

CARL provides highly configurable contextual extensions to several well-known RL environments.

CARL (context adaptive RL) provides highly configurable contextual extensions to several well-known RL environments.

51 Dec 28, 2022

TensorFlow GNN is a library to build Graph Neural Networks on the TensorFlow platform.

TensorFlow GNN This is an early (alpha) release to get community feedback. It's under active development and we may break API compatibility in the fut

889 Dec 30, 2022

Official implementation of the ICCV 2021 paper "Conditional DETR for Fast Training Convergence".

The DETR approach applies the transformer encoder and decoder architecture to object detection and achieves promising performance. In this paper, we handle the critical issue, slow training convergen

281 Dec 30, 2022

Fast, differentiable sorting and ranking in PyTorch

Torchsort Fast, differentiable sorting and ranking in PyTorch. Pure PyTorch implementation of Fast Differentiable Sorting and Ranking (Blondel et al.)

655 Jan 04, 2023

Dynamic Graph Event Detection

DyGED Dynamic Graph Event Detection Get Started pip install -r requirements.txt TODO Paper link to arxiv, and how to cite. Twitter Weather dataset tra

3 May 09, 2022

Accelerate Neural Net Training by Progressively Freezing Layers

FreezeOut A simple technique to accelerate neural net training by progressively freezing layers. This repository contains code for the extended abstra

203 Jun 19, 2022

PyTorch implementation of a Real-ESRGAN model trained on custom dataset

Real-ESRGAN PyTorch implementation of a Real-ESRGAN model trained on custom dataset. This model shows better results on faces compared to the original

160 Jan 04, 2023

Task Transformer Network for Joint MRI Reconstruction and Super-Resolution (MICCAI 2021)

T2Net Task Transformer Network for Joint MRI Reconstruction and Super-Resolution (MICCAI 2021) [Paper][Code] Dependencies numpy==1.18.5 scikit_image==

64 Nov 23, 2022

An implementation for Neural Architecture Search with Random Labels (CVPR 2021 poster) on Pytorch.

Neural Architecture Search with Random Labels(RLNAS) Introduction This project provides an implementation for Neural Architecture Search with Random L

18 Nov 08, 2022

Repository for publicly available deep learning models developed in Rosetta community

trRosetta2 This package contains deep learning models and related scripts used by Baker group in CASP14. Installation Linux/Mac clone the package git

81 Dec 29, 2022

Code and data for ACL2021 paper Cross-Lingual Abstractive Summarization with Limited Parallel Resources.

Multi-Task Framework for Cross-Lingual Abstractive Summarization (MCLAS) The code for ACL2021 paper Cross-Lingual Abstractive Summarization with Limit

43 Nov 07, 2022

Implementation of various Vision Transformers I found interesting

78 Dec 06, 2022

Repo for "Benchmarking Robustness of 3D Point Cloud Recognition against Common Corruptions" https://arxiv.org/abs/2201.12296

Benchmarking Robustness of 3D Point Cloud Recognition against Common Corruptions This repo contains the dataset and code for the paper Benchmarking Ro

168 Dec 29, 2022

Spherical Confidence Learning for Face Recognition, accepted to CVPR2021.

Sphere Confidence Face (SCF) This repository contains the PyTorch implementation of Sphere Confidence Face (SCF) proposed in the CVPR2021 paper: Shen

70 Dec 09, 2022

This is the official github repository of the Met dataset

The Met dataset This is the official github repository of the Met dataset. The official webpage of the dataset can be found here. What is it? This cod

35 Dec 17, 2022