StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

Related tags

Deep LearningStackGAN
Overview

StackGAN

Tensorflow implementation for reproducing main results in the paper StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks by Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas.

Dependencies

python 2.7

TensorFlow 0.12

[Optional] Torch is needed, if use the pre-trained char-CNN-RNN text encoder.

[Optional] skip-thought is needed, if use the skip-thought text encoder.

In addition, please add the project folder to PYTHONPATH and pip install the following packages:

  • prettytensor
  • progressbar
  • python-dateutil
  • easydict
  • pandas
  • torchfile

Data

  1. Download our preprocessed char-CNN-RNN text embeddings for birds and flowers and save them to Data/.
  • [Optional] Follow the instructions reedscot/icml2016 to download the pretrained char-CNN-RNN text encoders and extract text embeddings.
  1. Download the birds and flowers image data. Extract them to Data/birds/ and Data/flowers/, respectively.
  2. Preprocess images.
  • For birds: python misc/preprocess_birds.py
  • For flowers: python misc/preprocess_flowers.py

Training

  • The steps to train a StackGAN model on the CUB dataset using our preprocessed data for birds.
    • Step 1: train Stage-I GAN (e.g., for 600 epochs) python stageI/run_exp.py --cfg stageI/cfg/birds.yml --gpu 0
    • Step 2: train Stage-II GAN (e.g., for another 600 epochs) python stageII/run_exp.py --cfg stageII/cfg/birds.yml --gpu 1
  • Change birds.yml to flowers.yml to train a StackGAN model on Oxford-102 dataset using our preprocessed data for flowers.
  • *.yml files are example configuration files for training/testing our models.
  • If you want to try your own datasets, here are some good tips about how to train GAN. Also, we encourage to try different hyper-parameters and architectures, especially for more complex datasets.

Pretrained Model

  • StackGAN for birds trained from char-CNN-RNN text embeddings. Download and save it to models/.
  • StackGAN for flowers trained from char-CNN-RNN text embeddings. Download and save it to models/.
  • StackGAN for birds trained from skip-thought text embeddings. Download and save it to models/ (Just used the same setting as the char-CNN-RNN. We assume better results can be achieved by playing with the hyper-parameters).

Run Demos

  • Run sh demo/flowers_demo.sh to generate flower samples from sentences. The results will be saved to Data/flowers/example_captions/. (Need to download the char-CNN-RNN text encoder for flowers to models/text_encoder/. Note: this text encoder is provided by reedscot/icml2016).
  • Run sh demo/birds_demo.sh to generate bird samples from sentences. The results will be saved to Data/birds/example_captions/.(Need to download the char-CNN-RNN text encoder for birds to models/text_encoder/. Note: this text encoder is provided by reedscot/icml2016).
  • Run python demo/birds_skip_thought_demo.py --cfg demo/cfg/birds-skip-thought-demo.yml --gpu 2 to generate bird samples from sentences. The results will be saved to Data/birds/example_captions-skip-thought/. (Need to download vocabulary for skip-thought vectors to Data/skipthoughts/).

Examples for birds (char-CNN-RNN embeddings), more on youtube:

Examples for flowers (char-CNN-RNN embeddings), more on youtube:

Save your favorite pictures generated by our models since the randomness from noise z and conditioning augmentation makes them creative enough to generate objects with different poses and viewpoints from the same discription 😃

Citing StackGAN

If you find StackGAN useful in your research, please consider citing:

@inproceedings{han2017stackgan,
Author = {Han Zhang and Tao Xu and Hongsheng Li and Shaoting Zhang and Xiaogang Wang and Xiaolei Huang and Dimitris Metaxas},
Title = {StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks},
Year = {2017},
booktitle = {{ICCV}},
}

Our follow-up work

References

  • Generative Adversarial Text-to-Image Synthesis Paper Code
  • Learning Deep Representations of Fine-grained Visual Descriptions Paper Code
Owner
Han Zhang
Han Zhang
Jupyter notebooks showing best practices for using cx_Oracle, the Python DB API for Oracle Database

Python cx_Oracle Notebooks, 2022 The repository contains Jupyter notebooks showing best practices for using cx_Oracle, the Python DB API for Oracle Da

Christopher Jones 13 Dec 15, 2022
Softlearning is a reinforcement learning framework for training maximum entropy policies in continuous domains. Includes the official implementation of the Soft Actor-Critic algorithm.

Softlearning Softlearning is a deep reinforcement learning toolbox for training maximum entropy policies in continuous domains. The implementation is

Robotic AI & Learning Lab Berkeley 997 Dec 30, 2022
Intrusion Detection System using ensemble learning (machine learning)

IDS-ML implementation of an intrusion detection system using ensemble machine learning methods Data set This project is carried out using the UNSW-15

4 Nov 25, 2022
Python library containing BART query generation and BERT-based Siamese models for neural retrieval.

Neural Retrieval Embedding-based Zero-shot Retrieval through Query Generation leverages query synthesis over large corpuses of unlabeled text (such as

Amazon Web Services - Labs 35 Apr 14, 2022
Implementation of the Transformer variant proposed in "Transformer Quality in Linear Time"

FLASH - Pytorch Implementation of the Transformer variant proposed in the paper Transformer Quality in Linear Time Install $ pip install FLASH-pytorch

Phil Wang 209 Dec 28, 2022
render sprites into your desktop environment as shaped windows using GTK

spritegtk render static or animated sprites into your desktop environment as dynamic shaped windows using GTK requires pycairo and PYGobject: pip inst

hermit 20 Oct 27, 2022
Conversational text Analysis using various NLP techniques

PyConverse Let me try first Installation pip install pyconverse Usage Please try this notebook that demos the core functionalities: basic usage noteb

Rita Anjana 158 Dec 25, 2022
WebUAV-3M: A Benchmark Unveiling the Power of Million-Scale Deep UAV Tracking

WebUAV-3M: A Benchmark Unveiling the Power of Million-Scale Deep UAV Tracking [Paper Link] Abstract In this work, we contribute a new million-scale Un

25 Jan 01, 2023
Unofficial keras(tensorflow) implementation of MAE model from Masked Autoencoders Are Scalable Vision Learners

MAE-keras Unofficial keras(tensorflow) implementation of MAE model described in 'Masked Autoencoders Are Scalable Vision Learners'. This work has been

Yewon 11 Jun 12, 2022
Data Augmentation with Variational Autoencoders

Documentation Pyraug This library provides a way to perform Data Augmentation using Variational Autoencoders in a reliable way even in challenging con

112 Nov 30, 2022
Cancer metastasis detection with neural conditional random field (NCRF)

NCRF Prerequisites Data Whole slide images Annotations Patch images Model Training Testing Tissue mask Probability map Tumor localization FROC evaluat

Baidu Research 731 Jan 01, 2023
PyTorch code for the paper: FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning

FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning This is the PyTorch implementation of our paper: FeatMatch: Feature-Based Augmentat

43 Nov 19, 2022
Official PyTorch implementation of BlobGAN: Spatially Disentangled Scene Representations

BlobGAN: Spatially Disentangled Scene Representations Official PyTorch Implementation Paper | Project Page | Video | Interactive Demo BlobGAN.mp4 This

148 Dec 29, 2022
Download from Onlyfans.com.

OnlySave: Onlyfans downloader Getting Started: Download the setup executable from the latest release. Install and run. Only works on Windows currently

4 May 30, 2022
A two-stage U-Net for high-fidelity denoising of historical recordings

A two-stage U-Net for high-fidelity denoising of historical recordings Official repository of the paper (not submitted yet): E. Moliner and V. Välimäk

Eloi Moliner Juanpere 57 Jan 05, 2023
Python binding for Khiva library.

Khiva-Python Build Documentation Build Linux and Mac OS Build Windows Code Coverage README This is the Khiva Python binding, it allows the usage of Kh

Shapelets 46 Oct 16, 2022
Network Enhancement implementation in pytorch

network_enahncement_pytorch Network Enhancement implementation in pytorch Research paper Network Enhancement: a general method to denoise weighted bio

Yen 1 Nov 12, 2021
RoadMap and preparation material for Machine Learning and Data Science - From beginner to expert.

ML-and-DataScience-preparation This repository has the goal to create a learning and preparation roadMap for Machine Learning Engineers and Data Scien

33 Dec 29, 2022
ML-Decoder: Scalable and Versatile Classification Head

ML-Decoder: Scalable and Versatile Classification Head Paper Official PyTorch Implementation Tal Ridnik, Gilad Sharir, Avi Ben-Cohen, Emanuel Ben-Baru

189 Jan 04, 2023
This is the repo for our work "Towards Persona-Based Empathetic Conversational Models" (EMNLP 2020)

Towards Persona-Based Empathetic Conversational Models (PEC) This is the repo for our work "Towards Persona-Based Empathetic Conversational Models" (E

Zhong Peixiang 35 Nov 17, 2022