The code for two papers: Feedback Transformer and Expire-Span.

Last update: Dec 25, 2022


Overview

transformer-sequential

This repo contains the code for two papers:

Feedback Transformer
Expire-Span

The training code is structured for modeling long sequences with Transformer-like architectures.

Requirements

You will need a CUDA-enabled GPU to run the code.
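Before installing, it can help to confirm that an NVIDIA GPU is visible on the machine. The following is a minimal sketch, not part of the repo: the helper name `list_nvidia_gpus` is hypothetical, and it assumes the NVIDIA driver's `nvidia-smi` tool is on the `PATH`. It only shows that the driver can see a GPU; it does not prove that the installed deep learning framework can use it.

```python
import shutil
import subprocess
from typing import List


def list_nvidia_gpus() -> List[str]:
    """Return one line per GPU reported by `nvidia-smi -L`, or [] if none are visible.

    Hypothetical helper for illustration; it is not provided by this repo.
    """
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # The driver tool is not installed or not on PATH.
        return []
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    # `nvidia-smi -L` prints one line per detected GPU.
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = list_nvidia_gpus()
    if gpus:
        print("Detected GPUs:")
        for gpu in gpus:
            print("  " + gpu)
    else:
        print("No NVIDIA GPU detected; the experiments are unlikely to run.")
```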

Setup

Run the following:

pip install -r requirements.txt

Feedback Transformer

Introduced in "Addressing Some Limitations of Transformers with Feedback Memory".

Running Experiments from the Paper

enwik8

Model	Params	Valid	Test
Feedback Transformer	77M	0.984	0.962

Numbers are bits per character (BPC); lower is better.

bash experiments/feedback/enwik8.sh
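The enwik8 numbers in this README are reported in bits per character. If a training log reports the mean per-character cross-entropy in nats instead (an assumption; this README does not say which unit the scripts print), the two are related by a factor of ln 2. A small sketch of the conversion, with hypothetical function names:

```python
import math


def nats_to_bits_per_char(loss_nats: float) -> float:
    """Convert a mean per-character cross-entropy from nats to bits."""
    return loss_nats / math.log(2)


def bits_per_char_to_nats(bpc: float) -> float:
    """Inverse conversion: bits per character back to nats."""
    return bpc * math.log(2)


if __name__ == "__main__":
    # 0.962 is the Feedback Transformer test BPC from the table above.
    print(f"{bits_per_char_to_nats(0.962):.4f} nats per character")
```

The same conversion applies to the Expire-Span enwik8 results, which are also given in bits per character.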

Algorithmic

Model	3 Variable	5 Variable
Transformer	33.7	37.5
Feedback Transformer	99.1	92.6

Numbers are accuracy (%) on the test set; higher is better.

bash experiments/feedback/algorithmic_3var.sh
bash experiments/feedback/algorithmic_5var.sh

Expire-Span

Introduced in "Not All Memories are Created Equal: Learning to Expire".

Running Experiments from the Paper

enwik8

Model	Params	Valid	Test
Expire-Span 12L	38M	1.014	0.994

Numbers are bits per character (BPC); lower is better.

bash experiments/expire_span/enwik8.sh

Object Collision

Model	Maximum Span	Test Error (%)
Expire-Span	16k	52.2
Expire-Span	32k	36.7
Expire-Span	64k	26.7

bash experiments/expire_span/object_collision_16k.sh
bash experiments/expire_span/object_collision_32k.sh
bash experiments/expire_span/object_collision_64k.sh

License

The code is licensed under the CC-BY-NC license. See the LICENSE file for more details.


Owner

Meta Research
