Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction

Last update: Dec 15, 2022

Overview

This is a fork of Fairseq(-py) with implementations of the following models:

Pervasive Attention - 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction

An NMT model that uses two-dimensional convolutions to jointly encode the source and target sequences.

Pervasive Attention also provides an extensive decoding grid that we leverage to efficiently train wait-k models.

See README.
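
As a rough illustration of what "jointly encode" means here: in the Pervasive Attention paper, the network input is a 2D grid in which cell (i, j) pairs target token i with source token j by concatenating their embeddings, and masked 2D convolutions are then applied over that grid. The sketch below only builds such a grid from plain Python lists. `joint_grid` and the toy embeddings are hypothetical names chosen for illustration and are not part of this repository's API.

```python
def joint_grid(tgt_emb, src_emb):
    """Build a |target| x |source| grid of concatenated embeddings.

    tgt_emb and src_emb are lists of embedding vectors (lists of floats).
    Cell [i][j] holds the target embedding i followed by the source embedding j.
    This is an illustrative sketch, not the repository's implementation.
    """
    return [[t + s for s in src_emb] for t in tgt_emb]


# Toy example: 2 target tokens (dim 2) and 3 source tokens (dim 1).
tgt = [[1.0, 2.0], [3.0, 4.0]]
src = [[0.1], [0.2], [0.3]]
grid = joint_grid(tgt, src)
print(len(grid), len(grid[0]), len(grid[0][0]))  # rows, columns, channels
```

Each row of the grid corresponds to one target position, which is what makes it possible to read off different source/target prefixes from the same grid.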

Efficient Wait-k Models for Simultaneous Machine Translation

Transformer Wait-k models (Ma et al., 2019) with unidirectional encoders and with joint training of multiple wait-k paths.

See README.
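
For readers new to wait-k decoding, here is a minimal sketch of the policy itself, assuming the definition from Ma et al. (2019): the decoder first reads k source tokens, then alternates between writing one target token and reading one more source token until the source is exhausted. The helper names `waitk_context_sizes` and `waitk_mask` are hypothetical and do not correspond to functions in this repository.

```python
def waitk_context_sizes(src_len, tgt_len, k):
    """Number of source tokens available when predicting each target token.

    Target positions are 0-indexed, so position t sees min(k + t, src_len)
    source tokens.
    """
    return [min(k + t, src_len) for t in range(tgt_len)]


def waitk_mask(src_len, tgt_len, k):
    """Boolean mask: mask[t][j] is True if target step t may attend to source j."""
    sizes = waitk_context_sizes(src_len, tgt_len, k)
    return [[j < size for j in range(src_len)] for size in sizes]


# Example: 5 source tokens, 4 target tokens, wait-2 policy.
for row in waitk_mask(src_len=5, tgt_len=4, k=2):
    print("".join("x" if visible else "." for visible in row))
```

A mask like this only describes which source positions are visible at each decoding step; how it is applied inside the encoder and decoder is specific to the implementation.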

Fairseq Requirements and Installation

PyTorch version >= 1.4.0
Python version >= 3.6
For training new models, you'll also need an NVIDIA GPU and NCCL
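
One way to check the version requirements above before installing is a short script like the following. It is a sketch only: `version_tuple` and `meets_minimum` are hypothetical helpers written for this example, and the check assumes PyTorch exposes its version string as `torch.__version__`.

```python
import re
import sys


def version_tuple(version):
    """Parse the leading numeric part of a version string.

    For example, '1.4.0+cu101' becomes (1, 4, 0). Short versions are
    padded with zeros so that '1.4' compares equal to '1.4.0'.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    parts = tuple(int(p) for p in match.group().split(".")) if match else ()
    return parts + (0,) * (3 - len(parts))


def meets_minimum(version, minimum):
    """Return True if `version` is at least `minimum`."""
    return version_tuple(version) >= version_tuple(minimum)


if __name__ == "__main__":
    print("Python >= 3.6:", sys.version_info[:2] >= (3, 6))
    try:
        import torch
        print("PyTorch >= 1.4.0:", meets_minimum(torch.__version__, "1.4.0"))
    except ImportError:
        print("PyTorch is not installed")
```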

Installing Fairseq

git clone https://github.com/elbayadm/attn2d
cd attn2d
pip install --editable .

License

fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.

Citation

For Pervasive Attention, please cite:

@InProceedings{elbayad18conll,
    author = "Elbayad, Maha and Besacier, Laurent and Verbeek, Jakob",
    title = "Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction",
    booktitle = "Proceedings of the 22nd Conference on Computational Natural Language Learning",
    year = "2018",
}

For our wait-k models, please cite:

@article{elbayad20waitk,
    title={Efficient Wait-k Models for Simultaneous Machine Translation},
    author={Elbayad, Maha and Besacier, Laurent and Verbeek, Jakob},
    journal={arXiv preprint arXiv:2005.08595},
    year={2020}
}

For Fairseq, please cite:

@inproceedings{ott2019fairseq,
  title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
  year = {2019},
}

Owner

Maha
