The implementation of Parameter Differentiation based Multilingual Neural Machine Translation

Last update: Dec 17, 2022

Overview

This repository contains the implementation of Parameter Differentiation based Multilingual Neural Machine Translation.

Requirements:

apex
fairseq
scikit-learn
pytorch

Process the data following the fairseq multilingual translation example: https://github.com/pytorch/fairseq/tree/main/examples/translation#multilingual-translation

Training:

data_path=   # path to the binarized data directory (used below as $data_path)
lang_pairs=  # comma-separated language pairs, e.g. en-de,en-fr

fairseq-train $data_path \
    --task parameter_differentiation_task --lang-pairs $lang_pairs --encoder-langtok tgt \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
    --optimizer adam --lr 0.0015 --adam-betas '(0.9,0.98)' \
    --lr-scheduler inverse_sqrt --warmup-updates 4000 --warmup-init-lr 1e-07 \
    --arch parameter_differentiation_base_model \
    --max-tokens 8192 \
    --user-dir $PWD

Decoding:

source_lang=  # source language code
target_lang=  # target language code
model_path=   # path to the trained checkpoint

fairseq-generate $data_path --path $model_path \
    --task parameter_differentiation_task --lang-pairs $lang_pairs --encoder-langtok tgt \
    --beam 4 --lenpen 0.6 --remove-bpe sentencepiece \
    --source-lang $source_lang --target-lang $target_lang > result.$source_lang-$target_lang.txt
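
The result file written by the decoding command interleaves source (S-), reference (T-) and hypothesis (H-) lines, and sentences are not necessarily in corpus order. The following is a minimal sketch of how hypotheses and references could be separated for scoring. It assumes the standard fairseq-generate log format with tab-separated fields. The function name extract_generate_output and the .hyp / .ref output suffixes are hypothetical and not part of this repository.

```shell
# Hypothetical helper (not part of this repository): split a fairseq-generate
# log into one hypothesis file and one reference file, both in corpus order.
extract_generate_output() {
    result_file=$1

    # Hypothesis lines look like: H-<id><TAB><score><TAB><hypothesis>
    # Strip the "H-" prefix, sort numerically by sentence id, keep the text.
    grep '^H-' "$result_file" | sed 's/^H-//' | sort -n -k1,1 | cut -f3- \
        > "$result_file.hyp"

    # Reference lines look like: T-<id><TAB><reference>
    grep '^T-' "$result_file" | sed 's/^T-//' | sort -n -k1,1 | cut -f2- \
        > "$result_file.ref"
}
```

For example, extract_generate_output result.$source_lang-$target_lang.txt would write result.$source_lang-$target_lang.txt.hyp and result.$source_lang-$target_lang.txt.ref, which can then be passed to a BLEU scorer of your choice.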

Owner

Qian Wang
