Implementation of Multistream Transformers in Pytorch

Last update: Jul 26, 2022

Overview

Multistream Transformers

Implementation of Multistream Transformers in Pytorch.

This repository deviates slightly from the paper, where instead of using the skip connection across all streams, it uses attention pooling across all tokens in the same position. This has produced the best results in my experiments with number of streams greater than 2.

Install

$ pip install multistream-transformers

Usage

import torch
from multistream_transformers import MultistreamTransformer

model = MultistreamTransformer(
    num_tokens = 256,         # number of tokens
    dim = 512,                # dimension
    depth = 4,                # depth
    causal = True,            # autoregressive or not
    max_seq_len = 1024,       # maximum sequence length
    num_streams = 2           # number of streams - 1 would make it a regular transformer
)

x = torch.randint(0, 256, (2, 1024))
mask = torch.ones((2, 1024)).bool()

logits = model(x, mask = mask) # (2, 1024, 256)

Citations

@misc{burtsev2021multistream,
    title   = {Multi-Stream Transformers}, 
    author  = {Mikhail Burtsev and Anna Rumshisky},
    year    = {2021},
    eprint  = {2107.10342},
    archivePrefix = {arXiv},
    primaryClass = {cs.CL}
}

Big Bird: Transformers for Longer Sequences

BigBird, is a sparse-attention based transformer which extends Transformer based models, such as BERT to much longer sequences. Moreover, BigBird comes along with a theoretical understanding of the capabilities of a complete transformer that the sparse model can handle.

457 Dec 23, 2022

:mag: Transformers at scale for question answering & neural search. Using NLP via a modular Retriever-Reader-Pipeline. Supporting DPR, Elasticsearch, HuggingFace's Modelhub...

Haystack is an end-to-end framework for Question Answering & Neural search that enables you to ... ... ask questions in natural language and find gran

6.4k Jan 9, 2023

🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy

spacy-transformers: Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy This package provides spaCy components and architectures to use tr

1.2k Jan 8, 2023

spaCy plugin for Transformers , Udify, ELmo, etc.

Camphr - spaCy plugin for Transformers, Udify, Elmo, etc. Camphr is a Natural Language Processing library that helps in seamless integration for a wid

342 Nov 21, 2022

:mag: End-to-End Framework for building natural language search interfaces to data by utilizing Transformers and the State-of-the-Art of NLP. Supporting DPR, Elasticsearch, HuggingFace’s Modelhub and much more!

Haystack is an end-to-end framework that enables you to build powerful and production-ready pipelines for different search use cases. Whether you want

1.4k Feb 18, 2021

Guide: Finetune GPT2-XL (1.5 Billion Parameters) and GPT-NEO (2.7 B) on a single 16 GB VRAM V100 Google Cloud instance with Huggingface Transformers using DeepSpeed

Guide: Finetune GPT2-XL (1.5 Billion Parameters) and GPT-NEO (2.7 Billion Parameters) on a single 16 GB VRAM V100 Google Cloud instance with Huggingfa

289 Jan 6, 2023

Implementation of Multistream Transformers in Pytorch

Related tags

Overview

Multistream Transformers

Install

Usage

Citations

You might also like...

Big Bird: Transformers for Longer Sequences

:mag: Transformers at scale for question answering & neural search. Using NLP via a modular Retriever-Reader-Pipeline. Supporting DPR, Elasticsearch, HuggingFace's Modelhub...

🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy

spaCy plugin for Transformers , Udify, ELmo, etc.

:mag: End-to-End Framework for building natural language search interfaces to data by utilizing Transformers and the State-of-the-Art of NLP. Supporting DPR, Elasticsearch, HuggingFace’s Modelhub and much more!

🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy

spaCy plugin for Transformers , Udify, ELmo, etc.

A deep learning-based translation library built on Huggingface transformers

Guide: Finetune GPT2-XL (1.5 Billion Parameters) and GPT-NEO (2.7 B) on a single 16 GB VRAM V100 Google Cloud instance with Huggingface Transformers using DeepSpeed

Releases(0.0.4)

0.0.4(Jul 31, 2021)

0.0.3(Jul 31, 2021)

0.0.2(Jul 30, 2021)

0.0.1(Jul 30, 2021)

Owner

Phil Wang

SHAS: Approaching optimal Segmentation for End-to-End Speech Translation

Smart discord chatbot integrated with Dialogflow to manage different classrooms and assist in teaching!

Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application with a focus on embedded systems.

A repo for open resources & information for people to succeed in PhD in CS & career in AI / NLP

Control the classic General Instrument SP0256-AL2 speech chip and AY-3-8910 sound generator with a Raspberry Pi and this Python library.

Source code and dataset for ACL 2019 paper "ERNIE: Enhanced Language Representation with Informative Entities"

The repository for the paper: Multilingual Translation via Grafting Pre-trained Language Models

Original implementation of the pooling method introduced in "Speaker embeddings by modeling channel-wise correlations"

Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

Speach Recognitions

Data and code to support "Applied Natural Language Processing" (INFO 256, Fall 2021, UC Berkeley)

The guide to tackle with the Text Summarization

A repo for materials relating to the tutorial of CS-332 NLP

Long text token classification using LongFormer

Official implementation of MLP Singer: Towards Rapid Parallel Korean Singing Voice Synthesis

This program do translate english words to portuguese

Programme de chiffrement et de déchiffrement inverse d'un message en python3.

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

DeepAmandine is an artificial intelligence that allows you to talk to it for hours, you won't know the difference.

NLP Overview