Language Generation with Recurrent Generative Adversarial Networks without Pre-training

Code for training and evaluation of the model from "Language Generation with Recurrent Generative Adversarial Networks without Pre-training".

A short summary of the paper is available here.

Sample outputs (32 chars)

" There has been to be a place w
On Friday , the stories in Kapac
From should be taken to make it 
He is conference for the first t
For a lost good talks to ever ti

Training

To start training the CL+VL+TH model, first download the dataset, available at http://www.statmt.org/lm-benchmark/, and extract it into the ./data directory.
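If you prefer to script this step, here is a hypothetical helper; the archive name is inferred from the DATA_DIR default listed below, so verify it against the benchmark page before relying on it:

# Hypothetical download helper; the archive name is inferred from the
# DATA_DIR default in this README, so check http://www.statmt.org/lm-benchmark/.
import os
import tarfile

try:  # Python 3
    from urllib.request import urlretrieve
except ImportError:  # Python 2.7, which this repo targets
    from urllib import urlretrieve

URL = ('http://www.statmt.org/lm-benchmark/'
       '1-billion-word-language-modeling-benchmark-r13output.tar.gz')

if not os.path.isdir('./data'):
    os.makedirs('./data')
archive, _ = urlretrieve(URL, './data/benchmark.tar.gz')
with tarfile.open(archive, 'r:gz') as tar:
    tar.extractall('./data')  # yields ./data/1-billion-word-language-modeling-benchmark-r13output/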

Then use the following command:

python curriculum_training.py

The following packages are required:

  • Python 2.7
  • Tensorflow 1.1
  • Scipy
  • Matplotlib

The following parameters can be configured:

LOGS_DIR: Path to save model checkpoints and samples during training (defaults to './logs/')
DATA_DIR: Path to load the data from (defaults to './data/1-billion-word-language-modeling-benchmark-r13output/')
CKPT_PATH: Path to checkpoint file when restoring a saved model
BATCH_SIZE: Size of batch (defaults to 64)
CRITIC_ITERS: Number of iterations for the discriminator (defaults to 10)
GEN_ITERS: Number of iterations for the generator (defaults to 50)
MAX_N_EXAMPLES: Number of samples to load from dataset (defaults to 10000000)
GENERATOR_MODEL: Name of generator model (currently only 'Generator_GRU_CL_VL_TH' is available)
DISCRIMINATOR_MODEL: Name of discriminator model (currently only 'Discriminator_GRU' is available)
PICKLE_PATH: Path to PKL directory to hold cached pickle files (defaults to './pkl')
ITERATIONS_PER_SEQ_LENGTH: Number of iterations to run for each sequence length in the curriculum training (see the sketch after this list; defaults to 15000)
NOISE_STDEV: Standard deviation for the noise vector (defaults to 10.0)
DISC_STATE_SIZE: Discriminator GRU state size (defaults to 512)
GEN_STATE_SIZE: Generator GRU state size (defaults to 512)
TRAIN_FROM_CKPT: Boolean, set to True to restore from checkpoint (defaults to False)
GEN_GRU_LAYERS: Number of GRU layers for the generator (defaults to 1)
DISC_GRU_LAYERS: Number of GRU layers for the discriminator (defaults to 1)
START_SEQ: Sequence length to start the curriculum learning with (defaults to 1)
END_SEQ: Sequence length to end the curriculum learning with (defaults to 32)
SAVE_CHECKPOINTS_EVERY: Number of steps between checkpoint saves (defaults to 25000)
LIMIT_BATCH: Boolean that indicates whether to limit the batch size (defaults to True)
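
For intuition, here is a minimal sketch of how the curriculum parameters interact, with placeholder update functions; the real loop in curriculum_training.py also handles variable lengths, the teacher-helping mechanism, and checkpointing, and may account for iterations differently:

# A minimal sketch of the curriculum schedule, using the defaults above.
# train_discriminator/train_generator are placeholders, not the repo's code.
START_SEQ, END_SEQ = 1, 32
ITERATIONS_PER_SEQ_LENGTH = 15000
CRITIC_ITERS, GEN_ITERS = 10, 50

def train_discriminator(seq_len):
    pass  # placeholder: one critic (discriminator) update at this length

def train_generator(seq_len):
    pass  # placeholder: one generator update at this length

for seq_len in range(START_SEQ, END_SEQ + 1):   # curriculum: grow sequence length
    for _ in range(ITERATIONS_PER_SEQ_LENGTH):  # train at this length
        for _ in range(CRITIC_ITERS):
            train_discriminator(seq_len)
        for _ in range(GEN_ITERS):
            train_generator(seq_len)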

Parameters can be set by either changing their value in the config file or by passing them in the terminal:

python curriculum_training.py --START_SEQ=1 --END_SEQ=32
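
For reference, TF 1.x codebases of this vintage usually define such parameters with tf.app.flags; a hedged sketch of what the config definitions might look like (the repo's actual config file is authoritative):

import tensorflow as tf  # Tensorflow 1.1, per the requirements above

flags = tf.app.flags
flags.DEFINE_integer('START_SEQ', 1, 'Sequence length to start the curriculum with')
flags.DEFINE_integer('END_SEQ', 32, 'Sequence length to end the curriculum with')
flags.DEFINE_integer('BATCH_SIZE', 64, 'Size of batch')
flags.DEFINE_float('NOISE_STDEV', 10.0, 'Standard deviation for the noise vector')
FLAGS = flags.FLAGS

print(FLAGS.START_SEQ)  # command-line overrides like --START_SEQ=1 land here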

Generating text

The generate.py script will generate BATCH_SIZE samples using a saved model. It should be run with the same parameters that were used to train the model (if they differ from the default values). For example:

python generate.py --CKPT_PATH=/path/to/checkpoint/seq-32/ckp --DISC_GRU_LAYERS=2 --GEN_GRU_LAYERS=2

(If your model has not reached stage 32 of the curriculum, change the '32' in the path above to the highest stage your model trained on.)
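
If you are unsure which stage a run reached, one quick check (using the example path above, which you should replace with your own) is:

import tensorflow as tf

# Prints the newest checkpoint prefix recorded for this stage's directory,
# or None if training never saved a checkpoint there.
print(tf.train.latest_checkpoint('/path/to/checkpoint/seq-32'))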

Evaluating text

To evaluate samples using our %-IN-TEST-n metrics, run the following command, pointing it at a .txt file in which each row is a sample:

python evaluate.py --INPUT_SAMPLE=/path/to/samples.txt
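
To our reading, %-IN-TEST-n measures the percentage of word n-grams in the generated samples that also occur in a held-out test set; the sketch below illustrates that idea with simplified tokenization (evaluate.py is authoritative):

# Illustrative %-IN-TEST-n computation; file I/O and tokenization simplified.
def ngrams(tokens, n):
    return zip(*[tokens[i:] for i in range(n)])

def percent_in_test(sample_lines, test_text, n):
    test_set = set(ngrams(test_text.split(), n))
    grams = [g for line in sample_lines for g in ngrams(line.split(), n)]
    if not grams:
        return 0.0
    return 100.0 * sum(g in test_set for g in grams) / len(grams)

samples = ['On Friday , the stories in Kapac']      # a sample from above
print(percent_in_test(samples, 'On Friday the market rose .', n=2))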

Reference

If you find this code useful, please cite the following paper:

@article{press2017language,
  title={Language Generation with Recurrent Generative Adversarial Networks without Pre-training},
  author={Press, Ofir and Bar, Amir and Bogin, Ben and Berant, Jonathan and Wolf, Lior},
  journal={arXiv preprint arXiv:1706.01399},
  year={2017}
}

Acknowledgments

This repository is based on the code published in Improved Training of Wasserstein GANs.
