The repository for the paper: Multilingual Translation via Grafting Pre-trained Language Models

Last update: Dec 14, 2022

Graformer

Graformer (named BridgeTransformer in the code) is a sequence-to-sequence model designed mainly for neural machine translation. It improves multilingual translation by grafting together pre-trained language models: a pre-trained masked language model as the encoder (BERT) and a pre-trained language model as the decoder (GPT). The code is based on Fairseq.

Examples

Start from run/run.sh, which needs some minor modifications. The individual scripts cover the following steps:

pre-train a BERT (masked language model):
    run_arnold_multilingual_masked_lm_6e6d.sh

pre-train a GPT (language model):
    run_arnold_multilingual_lm_6e6d.sh

train a Graformer:
    run_arnold_multilingual_graft_transformer_12e12d_ted.sh

inference from Graformer:
    run_arnold_multilingual_graft_inference_ted.sh
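
The four scripts above can be read as a pipeline: pre-train the encoder and the decoder, graft them into a Graformer, then run inference. The following is a minimal dry-run sketch that only prints the commands in that order. It assumes the scripts sit next to run.sh in the run/ directory and take no arguments; neither is confirmed here, so check the repository layout and each script's variables before running anything. RUN_DIR is a hypothetical variable introduced for this sketch.

```shell
#!/bin/sh
# Dry run: print the four stages in the order the README lists them.
# ASSUMPTION: the scripts live in run/ (only run/run.sh is named explicitly).
RUN_DIR="${RUN_DIR:-run}"

count=0
# Order: BERT pre-training, GPT pre-training, Graformer training, inference.
for script in \
    run_arnold_multilingual_masked_lm_6e6d.sh \
    run_arnold_multilingual_lm_6e6d.sh \
    run_arnold_multilingual_graft_transformer_12e12d_ted.sh \
    run_arnold_multilingual_graft_inference_ted.sh
do
    count=$((count + 1))
    # Only echo the command; nothing is executed.
    echo "stage ${count}: bash ${RUN_DIR}/${script}"
done
```

Once the paths and settings are confirmed, each printed command can be run by hand, one stage at a time.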

Released Models

We release our pre-trained mBERT and mGPT, along with the trained Graformer model, here.

TensorFlow Version

We will provide a TensorFlow version in NeurST, a popular toolkit for sequence processing.

Citation

Please cite as:

@inproceedings{sun2021mulilingual,
    title = "Multilingual Translation via Grafting Pre-trained Language Models",
    author = "Sun, Zewei and Wang, Mingxuan and Li, Lei",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    year = "2021"
}

Contact

If you have any questions, please feel free to contact me: [email protected]
