Code for ICLR 2021 Paper, "Anytime Sampling for Autoregressive Models via Ordered Autoencoding"

Overview

Anytime Autoregressive Model

Anytime Sampling for Autoregressive Models via Ordered Autoencoding , ICLR 21

​ Yilun Xu, Yang Song, Sahaj Gara, Linyuan Gong, Rui Shu, Aditya Grover, Stefano Ermon

A new family of autoregressive model that enables anytime sampling​! 😃

Experiment 1: Image generation

Training:

  • Step 1: Pretrain VQ-VAE with full code length:
python vqvae.py --hidden-size latent-size --k codebook-size --dataset name-of-dataset --data-folder paht-to-dataset  --out-path path-to-model --pretrain

latent-size: latent code length
codebook-size: codebook size
name-of-dataset: mnist / cifar10 / celeba
path-to-dataset: path to the roots of dataset
path-to-model: path to save checkpoints
  • Step 2: Train ordered VQ-VAE:
python vqvae.py --hidden-size latent-size --k codebook-size --dataset name-of-dataset --data-folder paht-to-dataset  --out-path path-to-model --restore-checkpoint path-to-checkpoint --lr learning-rate

latent-size: latent code length
codebook-size: codebook size
name-of-dataset: mnist / cifar10 / celeba
path-to-dataset: path to the roots of dataset
path-to-model: path to save checkpoints
path-to-checkpoint: the path of the best checkpoint in Step 1
learning-rate: learning rate (recommended:1e-3)

  • Step 3: Train autoregressive model
python train_ar.py --task integer_sequence_modeling \
path-to-dumped-codes --vocab-size codebook-size --tokens-per-sample latent-size \
--ae-dataset name-of-dataset --ae-data-path path to the roots of dataset --ae-checkpoint path-to-checkpoint --ae-batch-size 512 \
--arch transformer_lm --dropout dropout-rate --attention-dropout dropout-rate --activation-dropout dropout-rate \
--optimizer adam --adam-betas '(0.9, 0.98)' --adam-eps 1e-6 --weight-decay 0.1 --clip-norm 0.0 \
--lr 0.002 --lr-scheduler inverse_sqrt --warmup-updates 3000 --warmup-init-lr 1e-07 \
--max-sentences ar-batch-size \
--fp16 \
--max-update iterations \
--seed 2 \
--log-format json --log-interval 10000000 --no-epoch-checkpoints --no-last-checkpoints \
--save-dir path-to-model

path-to-dumped-codes: path to the dumped codes of VQ-VAE model (fasten training process)
dropout-rate: dropout rate
latent-size: latent code length
codebook-size: codebook size
name-of-dataset: mnist / cifar10 / celeba
path-to-dataset: path to the roots of dataset
path-to-model: path to save checkpoints
path-to-checkpoint: the path of the best checkpoint in Step 2
ar-batch-size: batch size of autorregressive model
iterations: training iterations

Anytime sampling (Inference):

python3 generate.py --n-samples number-of-samples --out-path paht-to-img \
--tokens-per-sample latent-size --vocab-size codebook-size --tokens-per-target code-num \
--ae-checkpoint path-to-ae --ae-batch-size 512 \
--ar-checkpoint path-to-ar --ar-batch-size batch-size
(--ae_celeba --ae_mnist)
number-of-samples: number of samples to be generated
path-to-img: path to the generated samples
latent-size: latent code length
codebook-size: codebook size
code-num: number of codes used to generated (Anytime sampling!)
path-to-ae: path to the VQ-VAE checkpoint in Step 2
path-to-ar: path to the Transformer checkpoint in Step 3
batch-size: batch size for Transforer
ae_celeba: store_true for generating CelebA
ae_mnist: store_true for generating mnist

Experiment 2: Audio Generation

Firstly cd audio-wave/src.

Training:

  • Step 1: Pretrain VQ-VAE with full code length:
python3 main.py -ex ../configuration/experimens_wave_vq_whole_bigger.jason
  • Step 2: Train ordered VQ-VAE:
python3 main.py -ex ../configuration/experimens_wave_vq_whole_bigger_u.json
  • Step 3: Training Transformerr models:

    • A more step: dump the codebook by: (Will merge this step in future version)
    python3 main.py -ex ../configuration/experimens_wave_vq_whole_bigger_u.json --dump
python train_ar.py --task integer_sequence_modeling \
path-to-dumped-codes --vocab-size codebook-size --tokens-per-sample latent-size \
--arch transformer_lm --dropout dropout-rate --attention-dropout dropout-rate --activation-dropout dropout-rate \
--optimizer adam --adam-betas '(0.9, 0.98)' --adam-eps 1e-6 --weight-decay 0.1 --clip-norm 0.0 \
--lr 0.002 --lr-scheduler inverse_sqrt --warmup-updates 3000 --warmup-init-lr 1e-07 \
--max-sentences ar-batch-size \
--fp16 \
--max-update iterations \
--seed 2 \
--log-format json --log-interval 10000000 --no-epoch-checkpoints --no-last-checkpoints \
--save-dir path-to-model

path-to-dumped-codes: path to the dumped codes of VQ-VAE model (fasten training process)
dropout-rate: dropout rate
latent-size: latent code length
codebook-size: codebook size
name-of-dataset: mnist / cifar10 / celeba
path-to-dataset: path to the roots of dataset
path-to-model: path to save checkpoints
ar-batch-size: batch size of autorregressive model
iterations: training iterations

Anytime sampling (Inference):

python3 generate.py --n-samples number-of-samples --out-path paht-to-img \
--tokens-per-sample latent-size --vocab-size codebook-size --tokens-per-target code-num \
--ar-checkpoint path-to-ar --ar-batch-size batch-size

number-of-samples: number of samples to be generated
path-to-img: path to the generated samples
latent-size: latent code length
codebook-size: codebook size
code-num: number of codes used to generated (Anytime sampling!)
path-to-ar: path to the Transformer checkpoint in Step 3
batch-size: batch size for Transforer

Citation

@inproceedings{
xu2021anytime,
title={Anytime Sampling for Autoregressive Models via Ordered Autoencoding},
author={Yilun Xu and Yang Song and Sahaj Garg and Linyuan Gong and Rui Shu and Aditya Grover and Stefano Ermon},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=TSRTzJnuEBS}
}
Owner
Yilun Xu
Yilun Xu
Interpolation-based reduced-order models

Interpolation-reduced-order-models Interpolation-based reduced-order models High-fidelity computational fluid dynamics (CFD) solutions are time consum

Donovan Blais 1 Jan 10, 2022
QTool: A Low-bit Quantization Toolbox for Deep Neural Networks in Computer Vision

This project provides abundant choices of quantization strategies (such as the quantization algorithms, training schedules and empirical tricks) for quantizing the deep neural networks into low-bit c

Monash Green AI Lab 51 Dec 10, 2022
This repository provides the code for MedViLL(Medical Vision Language Learner).

MedViLL This repository provides the code for MedViLL(Medical Vision Language Learner). Our proposed architecture MedViLL is a single BERT-based model

SuperSuperMoon 39 Jan 05, 2023
Example-custom-ml-block-keras - Custom Keras ML block example for Edge Impulse

Custom Keras ML block example for Edge Impulse This repository is an example on

Edge Impulse 8 Nov 02, 2022
Small little script to scrape, parse and check for active tor nodes. Can be used as proxies.

TorScrape TorScrape is a small but useful script made in python that scrapes a website for active tor nodes, parse the html and then save the nodes in

5 Dec 04, 2022
Repositorio oficial del curso IIC2233 Programación Avanzada 🚀✨

IIC2233 - Programación Avanzada Evaluación Las evaluaciones serán efectuadas por medio de actividades prácticas en clases y tareas. Se calculará la no

IIC2233 @ UC 0 Dec 15, 2022
Pre-trained model, code, and materials from the paper "Impact of Adversarial Examples on Deep Learning Models for Biomedical Image Segmentation" (MICCAI 2019).

Adaptive Segmentation Mask Attack This repository contains the implementation of the Adaptive Segmentation Mask Attack (ASMA), a targeted adversarial

Utku Ozbulak 53 Jul 04, 2022
some classic model used to segment the medical images like CT、X-ray and so on

github_project This is a project for medical image segmentation. This project includes common medical image segmentation models such as U-net, FCN, De

2 Mar 30, 2022
Creating multimodal multitask models

Fusion Brain Challenge The English version of the document can be found here. Обновления 01.11 Мы выкладываем пример данных, аналогичных private test

Sber AI 43 Nov 28, 2022
Delving into Localization Errors for Monocular 3D Object Detection, CVPR'2021

Delving into Localization Errors for Monocular 3D Detection By Xinzhu Ma, Yinmin Zhang, Dan Xu, Dongzhan Zhou, Shuai Yi, Haojie Li, Wanli Ouyang. Intr

XINZHU.MA 124 Jan 04, 2023
Analysis of Smiles through reservoir sampling & RDkit

Analysis of Smiles through reservoir sampling and machine learning (under development). This is a simple project that includes two Jupyter files for t

Aurimas A. Nausėdas 6 Aug 30, 2022
A fast MoE impl for PyTorch

An easy-to-use and efficient system to support the Mixture of Experts (MoE) model for PyTorch.

Rick Ho 873 Jan 09, 2023
This repository contains the source code of an efficient 1D probabilistic model for music time analysis proposed in ICASSP2022 venue.

Jump Reward Inference for 1D Music Rhythmic State Spaces An implementation of the probablistic jump reward inference model for music rhythmic informat

Mojtaba Heydari 25 Dec 16, 2022
[ICCV 2021] Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification

Counterfactual Attention Learning Created by Yongming Rao*, Guangyi Chen*, Jiwen Lu, Jie Zhou This repository contains PyTorch implementation for ICCV

Yongming Rao 90 Dec 31, 2022
Automatically measure the facial Width-To-Height ratio and get facial analysis results provided by Microsoft Azure

fwhr-calc-website This project is to automatically measure the facial Width-To-Height ratio and get facial analysis results provided by Microsoft Azur

SoohyunPark 1 Feb 07, 2022
Useful materials and tutorials for 110-1 NTU DBME5028 (Application of Deep Learning in Medical Imaging)

Useful materials and tutorials for 110-1 NTU DBME5028 (Application of Deep Learning in Medical Imaging)

7 Jun 22, 2022
This repository implements variational graph auto encoder by Thomas Kipf.

Variational Graph Auto-encoder in Pytorch This repository implements variational graph auto-encoder by Thomas Kipf. For details of the model, refer to

DaehanKim 215 Jan 02, 2023
A method that utilized Generative Adversarial Network (GAN) to interpret the black-box deep image classifier models by PyTorch.

A method that utilized Generative Adversarial Network (GAN) to interpret the black-box deep image classifier models by PyTorch.

Yunxia Zhao 3 Dec 29, 2022
A minimal TPU compatible Jax implementation of NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

NeRF Minimal Jax implementation of NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. Result of Tiny-NeRF RGB Depth

Soumik Rakshit 11 Jul 24, 2022
SciPy fixes and extensions

scipyx SciPy is large library used everywhere in scientific computing. That's why breaking backwards-compatibility comes as a significant cost and is

Nico Schlömer 16 Jul 17, 2022