The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)

Last update: Nov 21, 2022

Overview

Language Models are Few-shot Multilingual Learners

Paper

This is the source code of the paper [Arxiv] [ACL Anthology]:

This code has been written using PyTorch. If you use source codes or datasets included in this toolkit in your work, please cite the following paper:

@inproceedings{winata-etal-2021-language,
    title = "Language Models are Few-shot Multilingual Learners",
    author = "Winata, Genta Indra  and
      Madotto, Andrea  and
      Lin, Zhaojiang  and
      Liu, Rosanne  and
      Yosinski, Jason  and
      Fung, Pascale",
    booktitle = "Proceedings of the 1st Workshop on Multilingual Representation Learning",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.mrl-1.1",
    pages = "1--15",
}

Setup Environment

GPU Machine

pip install -r requirements.txt

GPU Machine for Running GPT-J 6B Model

apt install zstd

# the "slim" version contain only bf16 weights and no optimizer parameters, which minimizes bandwidth and memory
wget -c https://the-eye.eu/public/AI/GPT-J-6B/step_383500_slim.tar.zstd

tar -I zstd -xf step_383500_slim.tar.zstd

pip install -r mesh_transformer_jax/requirements.txt

# jax 0.2.12 is required due to a regression with xmap in 0.2.13
pip install mesh-transformer-jax/ jax==0.2.12

# cuda[your_cuda_version]
pip install jaxlib==0.1.67+cuda101 -f https://storage.googleapis.com/jax-releases/jax_releases.html

How to run

Zero-shot Cross-task

❱❱❱ CUDA_VISIBLE_DEVICES=0 python evaluate.py  --dataset snips --model_checkpoint facebook/bart-large-mnli --cuda --length 5 --label_type value --src_lang en --tgt_lang en --seed 42 --use_log_prob --use_confidence --is_cross_task

Finetune

❱❱❱ CUDA_VISIBLE_DEVICES=0 python finetune.py  --dataset snips --model_checkpoint bert-base-multilingual-uncased --cuda --label_type value --src_lang en --tgt_lang en --seed 42

The source code of "Language Models are Few-shot Multilingual Learners" (MRL @ EMNLP 2021)

Related tags

Overview

Language Models are Few-shot Multilingual Learners

Paper

Setup Environment

GPU Machine

GPU Machine for Running GPT-J 6B Model

How to run

Zero-shot Cross-task

Finetune

Owner

Genta Indra Winata

AI Assistant for Building Reliable, High-performing and Fair Multilingual NLP Systems

Reformer, the efficient Transformer, in Pytorch

用Resnet101+GPT搭建一个玩王者荣耀的AI

Speech to text streamlit app

A Word Level Transformer layer based on PyTorch and 🤗 Transformers.

HAN2HAN : Hangul Font Generation

Topic Inference with Zeroshot models

BERT, LDA, and TFIDF based keyword extraction in Python

NLPretext packages in a unique library all the text preprocessing functions you need to ease your NLP project.

CCF BDCI 2020 房产行业聊天问答匹配赛道 A榜47/2985

PatrickStar enables Larger, Faster, Greener Pretrained Models for NLP. Democratize AI for everyone.

Code for the paper PermuteFormer

Pretrain CPM - 大规模预训练语言模型的预训练代码

A text file containing 479k English words for all your dictionary/word-based projects e.g: auto-completion / autosuggestion

Python functions for summarizing and improving voice dictation input.

A minimal code for fairseq vq-wav2vec model inference.

Blender addon - Scrub timeline from viewport with a shortcut

Named Entity Recognition API used by TEI Publisher

Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 models for speech recognition

Active learning for text classification in Python