Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction

Last update: Dec 15, 2022

Overview

This is a fork of Fairseq(-py) with implementations of the following models:

Pervasive Attention - 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction

An NMT model that uses two-dimensional convolutions to jointly encode the source and target sequences.

Pervasive Attention also provides an extensive decoding grid that we leverage to efficiently train wait-k models.

See README.
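To make the joint encoding concrete, here is a minimal pure-Python sketch of the 2D input grid and of a kernel mask that hides future target positions. It is an illustration only and does not use the repository's code. The names `joint_grid` and `causal_kernel_mask` are hypothetical, and the sketch assumes the grid is built by concatenating one target embedding with one source embedding per cell.

```python
from typing import List

Vector = List[float]


def joint_grid(src: List[Vector], tgt: List[Vector]) -> List[List[Vector]]:
    """Build a (target x source) grid of concatenated embeddings.

    Cell [t][s] holds tgt[t] followed by src[s], so a 2D convolution
    over the grid sees both sequences at once.
    """
    return [[t_vec + s_vec for s_vec in src] for t_vec in tgt]


def causal_kernel_mask(kernel_size: int) -> List[List[int]]:
    """Mask for a square kernel laid over (target, source) axes.

    Rows are target offsets. Rows below the centre would look at
    future target tokens, so they are zeroed; the source axis is
    left fully visible.
    """
    centre = kernel_size // 2
    return [
        [1 if row <= centre else 0 for _ in range(kernel_size)]
        for row in range(kernel_size)
    ]


# Toy 1-dimensional "embeddings": 3 source tokens, 2 target tokens.
src = [[1.0], [2.0], [3.0]]
tgt = [[10.0], [20.0]]
grid = joint_grid(src, tgt)
print(len(grid), len(grid[0]))   # 2 3
print(grid[1][2])                # [20.0, 3.0]
print(causal_kernel_mask(3))     # [[1, 1, 1], [1, 1, 1], [0, 0, 0]]
```

In a real model the mask would be multiplied into the convolution weights; the layer types and hyperparameters used in this repository are described in the linked README, not here.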

Efficient Wait-k Models for Simultaneous Machine Translation

Transformer Wait-k models (Ma et al., 2019) with unidirectional encoders and with joint training of multiple wait-k paths.

See README.
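As a rough illustration of what a wait-k path is, the sketch below computes how many source tokens are visible at each decoding step under the wait-k policy of Ma et al. (2019): the decoder first reads k source tokens, then alternates between writing one target token and reading one more source token. The function names `wait_k_context` and `wait_k_mask` are hypothetical and are not part of this repository's API.

```python
from typing import List


def wait_k_context(k: int, src_len: int, tgt_len: int) -> List[int]:
    """Number of source tokens read before emitting target token t.

    Uses g(t) = min(k + t - 1, src_len) with t counted from 1.
    """
    return [min(k + t - 1, src_len) for t in range(1, tgt_len + 1)]


def wait_k_mask(k: int, src_len: int, tgt_len: int) -> List[List[bool]]:
    """Boolean (target x source) mask: True where attention is allowed."""
    visible = wait_k_context(k, src_len, tgt_len)
    return [[s < n for s in range(src_len)] for n in visible]


print(wait_k_context(3, 5, 6))  # [3, 4, 5, 5, 5, 5]
for row in wait_k_mask(1, 3, 2):
    print(row)
# [True, False, False]
# [True, True, False]
```

Training on several wait-k paths could then amount to building this mask with a different k for different batches; how the repository actually schedules k is covered in the linked README.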

Fairseq Requirements and Installation

PyTorch version >= 1.4.0
Python version >= 3.6
For training new models, you'll also need an NVIDIA GPU and NCCL
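The version requirements above can be checked before installing. The helper below is a small sketch, not part of Fairseq; `version_tuple` and `meets_minimum` are hypothetical names, and the parser assumes version strings that start with dotted integers (for example `1.13.1+cu117`).

```python
import re
import sys
from typing import Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Parse the leading dotted-integer part of a version string."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if `installed` is at least `minimum`."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that "1.4" and "1.4.0" compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Python requirement from the list above.
print(sys.version_info >= (3, 6))
# PyTorch requirement, shown on literal strings so no import is needed.
print(meets_minimum("1.13.1+cu117", "1.4.0"))  # True
print(meets_minimum("1.3.1", "1.4.0"))         # False
```

With PyTorch installed, the same check can be run as `meets_minimum(torch.__version__, "1.4.0")`.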

Installing Fairseq

git clone https://github.com/elbayadm/attn2d
cd attn2d
pip install --editable .

License

fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.

Citation

For Pervasive Attention, please cite:

@InProceedings{elbayad18conll,
    author = "Elbayad, Maha and Besacier, Laurent and Verbeek, Jakob",
    title = "Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction",
    booktitle = "Proceedings of the 22nd Conference on Computational Natural Language Learning",
    year = "2018",
}

For our wait-k models, please cite:

@article{elbayad20waitk,
    title={Efficient Wait-k Models for Simultaneous Machine Translation},
    author={Elbayad, Maha and Besacier, Laurent and Verbeek, Jakob},
    journal={arXiv preprint arXiv:2005.08595},
    year={2020}
}

For Fairseq, please cite:

@inproceedings{ott2019fairseq,
  title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
  year = {2019},
}
