The code for two papers: Feedback Transformer and Expire-Span.

Last update: Dec 25, 2022

transformer-sequential

This repo contains the code for two papers:

Feedback Transformer
Expire-Span

The training code is structured for modeling long sequences with Transformer-like architectures.

Requirements

You will need a CUDA-enabled GPU to run the code.

Setup

Install the Python dependencies:

pip install -r requirements.txt
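
Since the code needs a CUDA-enabled GPU, it can be worth checking that the GPU is visible before launching an experiment. The snippet below is a minimal sketch, not part of the repo. It assumes `requirements.txt` installs PyTorch, and the helper name `cuda_status` is hypothetical.

```python
import importlib.util


def cuda_status() -> str:
    """Return 'available', 'unavailable', or 'torch-missing'.

    Hypothetical helper for illustration; assumes the project depends on PyTorch.
    """
    # Avoid an ImportError if the dependencies have not been installed yet.
    if importlib.util.find_spec("torch") is None:
        return "torch-missing"

    import torch

    # torch.cuda.is_available() reports whether PyTorch can use a CUDA device.
    return "available" if torch.cuda.is_available() else "unavailable"


if __name__ == "__main__":
    print(f"CUDA status: {cuda_status()}")
```

If this reports anything other than `available`, the experiment scripts are unlikely to run as intended.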

Feedback Transformer

Introduced in "Addressing Some Limitations of Transformers with Feedback Memory".

Running Experiments from the Paper

enwik8

Model	Params	Valid	Test
Feedback Transformer	77M	0.984	0.962

Numbers are bits per character (BPC); lower is better.

bash experiments/feedback/enwik8.sh
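
The bits-per-character figures in the table above can be related to a cross-entropy loss as follows. This is a sketch under the assumption that the loss is an average per-character cross-entropy in nats, which is what a natural-log cross-entropy produces. How this repo logs its loss is not stated here, and the function names are hypothetical.

```python
import math


def nats_to_bpc(loss_nats: float) -> float:
    """Convert an average per-character cross-entropy from nats to bits."""
    # Changing the logarithm base from e to 2 divides by ln(2).
    return loss_nats / math.log(2)


def bpc_to_nats(bpc: float) -> float:
    """Inverse conversion: bits per character back to nats per character."""
    return bpc * math.log(2)


if __name__ == "__main__":
    # 0.962 is the Feedback Transformer test BPC reported in the table above.
    reported_bpc = 0.962
    print(f"{reported_bpc} BPC corresponds to {bpc_to_nats(reported_bpc):.4f} nats per character")
```

Under the same assumption, a loss of exactly ln(2) nats per character corresponds to 1 BPC.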

Algorithmic

Model	3 Variable	5 Variable
Transformer	33.7	37.5
Feedback Transformer	99.1	92.6

Numbers are test accuracy (%).

bash experiments/feedback/algorithmic_3var.sh
bash experiments/feedback/algorithmic_5var.sh

Expire-Span

Introduced in "Not All Memories are Created Equal: Learning to Expire".

Running Experiments from the Paper

enwik8

Model	Params	Valid	Test
Expire-Span 12L	38M	1.014	0.994

Numbers are bits per character (BPC); lower is better.

bash experiments/expire_span/enwik8.sh

Object Collision

Model	Maximum Span	Test Error (%)
Expire-Span	16k	52.2
Expire-Span	32k	36.7
Expire-Span	64k	26.7

bash experiments/expire_span/object_collision_16k.sh
bash experiments/expire_span/object_collision_32k.sh
bash experiments/expire_span/object_collision_64k.sh

License

The code is licensed under the CC-BY-NC license. See the LICENSE file for more details.

Owner

Facebook Research
