Generative Adversarial Text to Image Synthesis

Overview

Text To Image Synthesis

This is a tensorflow implementation of synthesizing images. The images are synthesized using the GAN-CLS Algorithm from the paper Generative Adversarial Text-to-Image Synthesis. This implementation is built on top of the excellent DCGAN in Tensorflow.

Plese star https://github.com/tensorlayer/tensorlayer

Model architecture

Image Source : Generative Adversarial Text-to-Image Synthesis Paper

Requirements

Datasets

  • The model is currently trained on the flowers dataset. Download the images from here and save them in 102flowers/102flowers/*.jpg. Also download the captions from this link. Extract the archive, copy the text_c10 folder and paste it in 102flowers/text_c10/class_*.

N.B You can downloads all data files needed manually or simply run the downloads.py and put the correct files to the right directories.

python downloads.py

Codes

  • downloads.py download Oxford-102 flower dataset and caption files(run this first).
  • data_loader.py load data for further processing.
  • train_txt2im.py train a text to image model.
  • utils.py helper functions.
  • model.py models.

References

Results

  • the flower shown has yellow anther red pistil and bright red petals.
  • this flower has petals that are yellow, white and purple and has dark lines
  • the petals on this flower are white with a yellow center
  • this flower has a lot of small round pink petals.
  • this flower is orange in color, and has petals that are ruffled and rounded.
  • the flower has yellow petals and the center of it is brown
  • this flower has petals that are blue and white.
  • these white flowers have petals that start off white in color and end in a white towards the tips.

License

Apache 2.0

Comments
  • ValueError: Object arrays cannot be loaded when allow_pickle=False

    ValueError: Object arrays cannot be loaded when allow_pickle=False

    File "train_txt2im.py", line 458, in main_train() File "train_txt2im.py", line 133, in main_train load_and_assign_npz(sess=sess, name=net_rnn_name, model=net_rnn) File "train_txt2im.py", line 458, in main_train() File "train_txt2im.py", line 133, in main_train load_and_assign_npz(sess=sess, name=net_rnn_name, model=net_rnn) File "/home/siddanath/importantforprojects/text-to-image/utils.py", line 20, in load_and_assign_npz params = tl.files.load_npz(name=name) File "/home/siddanath/importantforprojects/text-to-image/tensorlayer/files.py", line 600, in load_npz return d['params'] File "/home/siddanath/anaconda3/lib/python3.7/site-packages/numpy/lib/npyio.py", line 262, in getitem pickle_kwargs=self.pickle_kwargs) File "/home/siddanath/anaconda3/lib/python3.7/site-packages/numpy/lib/format.py", line 722, in read_array raise ValueError("Object arrays cannot be loaded when " ValueError: Object arrays cannot be loaded when allow_pickle=False

    opened by Siddanth-pai 2
  • Attempt to have a second RNNCell use the weights of a variable scope that already has weights

    Attempt to have a second RNNCell use the weights of a variable scope that already has weights

    I got a problem, how can I solve it?

    Attempt to have a second RNNCell use the weights of a variable scope that already has weights: 'rnnftxt/rnn/dynamic/rnn/basic_lstm_cell'; and the cell was not constructed as BasicLSTMCell(..., reuse=True). To share the weights of an RNNCell, simply reuse it in your second calculation, or create a new one with the argument reuse=True.

    opened by flsd201983 1
  • Next step after download.py

    Next step after download.py

    What is the next step to do after download.py? I tried python data_loader.py, but it has FileNotFoundError: FileNotFoundError: [Errno 2] No such file or directory: '/home/ly/src/lib/text-to-image/102flowers/text_c10'

    opened by arisliang 0
  • ValueError: invalid literal for int() with base 10: 'e' - when making inference

    ValueError: invalid literal for int() with base 10: 'e' - when making inference

    code -

    sample_sentence = ["a"] * int(sample_size/ni) + ["e"] * int(sample_size/ni) + ["i"] * int(sample_size/ni) + ["o"] * int(sample_size/ni) + ["u"] * int(sample_size/ni)

    for i, sentence in enumerate(sample_sentence): print("seed: %s" % sentence) sentence = preprocess_caption(sentence) sample_sentence[i] = [vocab.word_to_id(word) for word in nltk.tokenize.word_tokenize( sentence)] + [vocab.end_id] # add END_ID

    sample_sentence = tl.prepro.pad_sequences(sample_sentence, padding='post')
    
    img_gen, rnn_out = sess.run([net_g_res.outputs, net_rnn_res.outputs], feed_dict={
        t_real_caption: sample_sentence,
        t_z: sample_seed})
    
    save_images(img_gen, [ni, ni], 'samples/gen_samples/gen.png')
    
    opened by Akinleyejoshua 0
  • Excuse me, why is the flower dataset I test the result is very different from result.png

    Excuse me, why is the flower dataset I test the result is very different from result.png

    import tensorflow as tf import tensorlayer as tl from tensorlayer.layers import * from tensorlayer.prepro import * from tensorlayer.cost import * import numpy as np import scipy from scipy.io import loadmat import time, os, re, nltk

    from utils import * from model import * import model import pickle

    ###======================== PREPARE DATA ====================================### print("Loading data from pickle ...") import pickle with open("_vocab.pickle", 'rb') as f: vocab = pickle.load(f) with open("_image_train.pickle", 'rb') as f: _, images_train = pickle.load(f) with open("_image_test.pickle", 'rb') as f: _, images_test = pickle.load(f) with open("_n.pickle", 'rb') as f: n_captions_train, n_captions_test, n_captions_per_image, n_images_train, n_images_test = pickle.load(f) with open("_caption.pickle", 'rb') as f: captions_ids_train, captions_ids_test = pickle.load(f)

    images_train_256 = np.array(images_train_256)

    images_test_256 = np.array(images_test_256)

    images_train = np.array(images_train) images_test = np.array(images_test)

    ni = int(np.ceil(np.sqrt(batch_size))) save_dir = "checkpoint"

    t_real_image = tf.placeholder('float32', [batch_size, image_size, image_size, 3], name = 'real_image')

    t_real_caption = tf.placeholder(dtype=tf.int64, shape=[batch_size, None], name='real_caption_input')

    t_z = tf.placeholder(tf.float32, [batch_size, z_dim], name='z_noise') generator_txt2img = model.generator_txt2img_resnet

    net_rnn = rnn_embed(t_real_caption, is_train=False, reuse=False) net_g, _ = generator_txt2img(t_z, net_rnn.outputs, is_train=False, reuse=False, batch_size=batch_size)

    sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) tl.layers.initialize_global_variables(sess)

    net_rnn_name = os.path.join(save_dir, 'net_rnn.npz400.npz') net_cnn_name = os.path.join(save_dir, 'net_cnn.npz400.npz') net_g_name = os.path.join(save_dir, 'net_g.npz400.npz') net_d_name = os.path.join(save_dir, 'net_d.npz400.npz')

    net_rnn_res = tl.files.load_and_assign_npz(sess=sess, name=net_rnn_name, network=net_rnn)

    net_g_res = tl.files.load_and_assign_npz(sess=sess, name=net_g_name, network=net_g)

    sample_size = batch_size sample_seed = np.random.normal(loc=0.0, scale=1.0, size=(sample_size, z_dim)).astype(np.float32)

    n = int(sample_size / ni) sample_sentence = ["the flower shown has yellow anther red pistil and bright red petals."] * n +
    ["this flower has petals that are yellow, white and purple and has dark lines"] * n +
    ["the petals on this flower are white with a yellow center"] * n +
    ["this flower has a lot of small round pink petals."] * n +
    ["this flower is orange in color, and has petals that are ruffled and rounded."] * n +
    ["the flower has yellow petals and the center of it is brown."] * n +
    ["this flower has petals that are blue and white."] * n +
    ["these white flowers have petals that start off white in color and end in a white towards the tips."] * n

    for i, sentence in enumerate(sample_sentence): print("seed: %s" % sentence) sentence = preprocess_caption(sentence) sample_sentence[i] = [vocab.word_to_id(word) for word in nltk.tokenize.word_tokenize(sentence)] + [vocab.end_id] # add END_ID

    sample_sentence = tl.prepro.pad_sequences(sample_sentence, padding='post')

    img_gen, rnn_out = sess.run([net_g_res.outputs, net_rnn_res.outputs], feed_dict={ t_real_caption : sample_sentence, t_z : sample_seed})

    save_images(img_gen, [ni, ni], 'samples/gen_samples/gen.png')

    opened by keqkeq 0
  • Tensorflow 2.1, Tensorlayer 2.2 update

    Tensorflow 2.1, Tensorlayer 2.2 update

    Hello,

    are there any plans in the near future to update this git to the latest Tensorflow and Tensorlayer versions? I've been trying making the code run with backwards compat (compat.tf1. ...) but I've keep bumping on errors which are a bit too big of mouth full for me.

    Fyi: I've succesfully run the DCGAN Tensorlayer implementation with Tensorlayer 2.2 and a self build Tensorflow 2.1 (with 3.0 compute compatibility) from source in Python 3.7.

    So, an update would be greatly appreciated!

    opened by SadRebel1000 0
Releases(0.2)
Owner
Hao
Assistant Professor @ Peking University
Hao
a Lightweight library for sequential learning agents, including reinforcement learning

SaLinA: SaLinA - A Flexible and Simple Library for Learning Sequential Agents (including Reinforcement Learning) TL;DR salina is a lightweight library

Facebook Research 405 Dec 17, 2022
Distributed Asynchronous Hyperparameter Optimization better than HyperOpt.

UltraOpt : Distributed Asynchronous Hyperparameter Optimization better than HyperOpt. UltraOpt is a simple and efficient library to minimize expensive

98 Aug 16, 2022
An AFL implementation with UnTracer (our coverage-guided tracer)

UnTracer-AFL This repository contains an implementation of our prototype coverage-guided tracing framework UnTracer in the popular coverage-guided fuz

113 Dec 17, 2022
Pytorch Implementation for CVPR2018 Paper: Learning to Compare: Relation Network for Few-Shot Learning

LearningToCompare Pytorch Implementation for Paper: Learning to Compare: Relation Network for Few-Shot Learning Howto download mini-imagenet and make

Jackie Loong 246 Dec 19, 2022
Face uncertainty quantification or estimation using PyTorch.

Face-uncertainty-pytorch This is a demo code of face uncertainty quantification or estimation using PyTorch. The uncertainty of face recognition is af

Kaen 3 Sep 16, 2022
A python implementation of Yolov5 to detect fire or smoke in the wild in Jetson Xavier nx and Jetson nano

yolov5-fire-smoke-detect-python A python implementation of Yolov5 to detect fire or smoke in the wild in Jetson Xavier nx and Jetson nano You can see

20 Dec 15, 2022
[CoRL 21'] TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo

TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo Lukas Koestler1*    Nan Yang1,2*,†    Niclas Zeller2,3    Daniel Cremers1

TUM Computer Vision Group 744 Jan 04, 2023
Classical OCR DCNN reproduction based on PaddlePaddle framework.

Paddle-SVHN Classical OCR DCNN reproduction based on PaddlePaddle framework. This project reproduces Multi-digit Number Recognition from Street View I

1 Nov 12, 2021
This repository contains demos I made with the Transformers library by HuggingFace.

Transformers-Tutorials Hi there! This repository contains demos I made with the Transformers library by 🤗 HuggingFace. Currently, all of them are imp

3.5k Jan 01, 2023
A library for answering questions using data you cannot see

A library for computing on data you do not own and cannot see PySyft is a Python library for secure and private Deep Learning. PySyft decouples privat

OpenMined 8.5k Jan 02, 2023
The code for the NeurIPS 2021 paper "A Unified View of cGANs with and without Classifiers".

Energy-based Conditional Generative Adversarial Network (ECGAN) This is the code for the NeurIPS 2021 paper "A Unified View of cGANs with and without

sianchen 22 May 28, 2022
BboxToolkit is a tiny library of special bounding boxes.

BboxToolkit is a light codebase collecting some practical functions for the special-shape detection, such as oriented detection

jbwang1997 73 Jan 01, 2023
这是一个yolox-keras的源码,可以用于训练自己的模型。

YOLOX:You Only Look Once目标检测模型在Keras当中的实现 目录 性能情况 Performance 实现的内容 Achievement 所需环境 Environment 小技巧的设置 TricksSet 文件下载 Download 训练步骤 How2train 预测步骤 Ho

Bubbliiiing 64 Nov 10, 2022
Self-Supervised Pre-Training for Transformer-Based Person Re-Identification

Self-Supervised Pre-Training for Transformer-Based Person Re-Identification [pdf] The official repository for Self-Supervised Pre-Training for Transfo

Hao Luo 116 Jan 04, 2023
Building blocks for uncertainty-aware cycle consistency presented at NeurIPS'21.

UncertaintyAwareCycleConsistency This repository provides the building blocks and the API for the work presented in the NeurIPS'21 paper Robustness vi

EML Tübingen 19 Dec 12, 2022
Old Photo Restoration (Official PyTorch Implementation)

Bringing Old Photo Back to Life (CVPR 2020 oral)

Microsoft 11.3k Dec 30, 2022
Evaluation toolkit of the informative tracking benchmark comprising 9 scenarios, 180 diverse videos, and new challenges.

Informative-tracking-benchmark Informative tracking benchmark (ITB) higher diversity. It contains 9 representative scenarios and 180 diverse videos. m

Xin Li 15 Nov 26, 2022
Moiré Attack (MA): A New Potential Risk of Screen Photos [NeurIPS 2021]

Moiré Attack (MA): A New Potential Risk of Screen Photos [NeurIPS 2021] This repository is the official implementation of Moiré Attack (MA): A New Pot

Dantong Niu 22 Dec 24, 2022
This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning].

CG3 This is the repository for the AAAI 21 paper [Contrastive and Generative Graph Convolutional Networks for Graph-based Semi-Supervised Learning]. R

12 Oct 28, 2022
Steerable discovery of neural audio effects

Steerable discovery of neural audio effects Christian J. Steinmetz and Joshua D. Reiss Abstract Applications of deep learning for audio effects often

Christian J. Steinmetz 182 Dec 29, 2022