Code for the Findings of NAACL 2022 (Long Paper): AdapterBias: Parameter-efficient Token-dependent Representation Shift for Adapters in NLP Tasks

Last update: Nov 12, 2022

AdapterBias: Parameter-efficient Token-dependent Representation Shift for Adapters in NLP Tasks

arXiv link: upcoming

To be published in Findings of NAACL 2022

Authors: Chin-Lun Fu*, Zih-Ching Chen*, Yun-Ru Lee, Hung-yi Lee

Overview

This study proposes AdapterBias, a surprisingly simple yet effective adapter architecture. AdapterBias adds a token-dependent shift to the hidden output of transformer layers, adapting to downstream tasks with only a vector and a linear layer.
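
The idea of a token-dependent shift can be illustrated with a small sketch. It uses plain Python lists rather than the repository's actual model code, and it assumes the linear layer maps each token representation to a single scalar weight (called alpha here, as the `--share_alpha` option suggests), which then scales the shared vector. The function name `adapter_bias_shift` and the parameters `v`, `w`, and `b` are hypothetical names chosen for illustration.

```python
from typing import List

Vector = List[float]


def adapter_bias_shift(hidden: List[Vector], v: Vector, w: Vector, b: float = 0.0) -> List[Vector]:
    """Add a token-dependent shift alpha_i * v to each token representation.

    hidden: one vector per token (sequence_length x hidden_size)
    v:      the shared shift vector (hidden_size)
    w, b:   weights and bias of a linear layer mapping a token vector to a scalar
            (assumed shape; the real implementation may differ)
    """
    shifted = []
    for h in hidden:
        # Linear layer: produce one scalar weight alpha for this token.
        alpha = sum(w_j * h_j for w_j, h_j in zip(w, h)) + b
        # Shift the token representation by alpha * v.
        shifted.append([h_j + alpha * v_j for h_j, v_j in zip(h, v)])
    return shifted


# Two toy tokens with hidden size 3; each token receives a different alpha.
hidden = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
v = [1.0, 1.0, 1.0]
w = [1.0, 1.0, 0.0]
print(adapter_bias_shift(hidden, v, w))  # [[2.0, 1.0, 1.0], [2.0, 4.0, 2.0]]
```

In this toy example the first token gets alpha = 1 and the second gets alpha = 2, so the same vector `v` shifts the two tokens by different amounts. The only trainable pieces are `v`, `w`, and `b`, which is where the parameter efficiency comes from.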

Dataset

We use the GLUE benchmark as our dataset. You can download all of its datasets from the GLUE benchmark website.

Training

cd src
python exp.py \
    --adapter True \
    --GLUE_path <your_GLUE_path> \
    --output_path <output_path> \
    --model <model_name> \
    --task <task_name> \
    --epoch 100 \
    --lr 0.0001 \
    --max_len 512 \
    --batch_size 32

-s or --seed specifies the random seed.
-g or --GLUE_path specifies the path to your GLUE dataset.
-o or --output_path specifies where the trained model and the prediction files are saved.
-m or --model specifies the pre-trained language model (PLM) used in training.
- Some examples: bert-base, bert-large, roberta-base, roberta-large
-t or --task specifies the downstream task.
- Some examples: cola, mnli, qnli, qqp, mrpc, rte, sst, sts
-a or --adapter specifies whether to add AdapterBias to the PLM.
--share_alpha specifies whether the same alpha is shared across all transformer layers in AdapterBias.
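
For readers who want to see how options like these fit together, here is a minimal argparse sketch. It is a hypothetical reconstruction from the option list above, not the contents of exp.py: the defaults, the `required` settings, the helper `str2bool`, and the assumption that `--share_alpha` takes a True/False value are all illustrative guesses. One detail worth noting is that `--adapter True` passes the string "True", so a converter is needed to turn it into a boolean.

```python
import argparse


def str2bool(value: str) -> bool:
    """Convert strings such as 'True' or 'False' to a bool for argparse."""
    if value.lower() in ("true", "1", "yes"):
        return True
    if value.lower() in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction of the documented options; exp.py may differ.
    parser = argparse.ArgumentParser(description="AdapterBias training options (sketch)")
    parser.add_argument("-s", "--seed", type=int, default=0)  # default is an assumption
    parser.add_argument("-g", "--GLUE_path", type=str, required=True)
    parser.add_argument("-o", "--output_path", type=str, required=True)
    parser.add_argument("-m", "--model", type=str, default="bert-base")
    parser.add_argument("-t", "--task", type=str, required=True)
    parser.add_argument("-a", "--adapter", type=str2bool, default=False)
    parser.add_argument("--share_alpha", type=str2bool, default=False)
    parser.add_argument("--epoch", type=int, default=100)
    parser.add_argument("--lr", type=float, default=1e-4)
    parser.add_argument("--max_len", type=int, default=512)
    parser.add_argument("--batch_size", type=int, default=32)
    return parser


# Parse an example command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--adapter", "True", "--GLUE_path", "glue_data", "--output_path", "out", "--task", "rte"]
)
print(args.adapter, args.task, args.lr)
```

Using `type=bool` directly would not work here, because any non-empty string (including "False") is truthy in Python.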

Inference

After training finishes, the prediction file is written automatically to <output_path>/result/, and the saved model to <output_path>/model/.

Once you have run all nine tasks of the GLUE benchmark, you can submit the prediction files to the GLUE benchmark website.
