Research code for "What to Pre-Train on? Efficient Intermediate Task Selection", EMNLP 2021

Overview

efficient-task-transfer

This repository contains code for the experiments in our paper "What to Pre-Train on? Efficient Intermediate Task Selection". Most importantly, this includes scripts for easy training of Transformers and Adapters across a wide range of NLU tasks.

Overview

The repository is structured as follows:

  • itrain holds the itrain package which allows easy setup, training and evaluation of Transformers and Adapters
  • run_configs provides default training configuration of all tasks currently supported by itrain
  • training_scripts provides scripts for sequential adapter fine-tuning and adapter fusion as used in the paper
  • task_selection provides scripts used for intermediate task selection in the paper

Setup & Requirements

The code in this repository was developed using Python v3.6.8, PyTorch v1.7.1 and adapter-transformers v1.1.1, which is based on HuggingFace Transformers v3.5.1. Using version different from the ones specified might not work.

After setting up Python and PyTorch (ideally in a virtual environment), all additional requirements together with the itrain package can be installed using:

pip install -e .

Additional setup steps required for running some scripts are detailed below locations.

Transformer & Adapter Training

The itrain package provides a simple interface for configuring Transformer and Adapter training runs. itrain provides tools for:

  • downloading and preprocessing datasets via HuggingFace datasets
  • setting up Transformers and Adapter training
  • training and evaluating on different tasks
  • notifying on training start and results via mail or Telegram

itrain can be invoked from the command line by passing a run configuration file in json format. Example configurations for all currently supported tasks can be found in the run_configs folder. All supported configuration keys are defined in arguments.py.

Running a setup from the command line can look like this:

itrain --id 42 run_configs/sst2.json

This will train an adapter on the SST-2 task using robert-base as the base model (as specified in the config file).

Besides modifying configuration keys directly in the json file, they can be overriden using command line parameters. E.g., we can modify the previous training run to fully fine-tune a bert-base-uncased model:

itrain --id <run_id> \
    --model_name_or_path bert-base-uncased \
    --train_adapter false \
    --learning_rate 3e-5 \
    --num_train_epochs 3 \
    --patience 0 \
    run_configs/<task>.json

Alternatively, training setups can be configured directly in Python by using the Setup class of itrain. An example for this is given in example.py.

Intermediate Task Transfer & Task Selection Experiments

Some scripts that helped running experiments presented in "What to Pre-Train on? Efficient Intermediate Task Selection" are provided:

  • See training_scripts for details on intermediate task transfer using sequential fine-tuning or adapter fusion
  • See task_selection for details on intermediate task selection methods.

All these scripts rely on pre-trained models/ adapters as described above and the following additional setup.

Setup

We used a configuration file to specify the pre-trained models/ adapters and tasks to be used as transfer sources and transfer targets for different task transfer strategies and task selection methods. The full configuration as used in the paper is given in task_map.json. It has to be modified to use self-trained models/ adapters:

  • from and to specify which tasks are used as transfer source and transfer targets (names as defined in run_configs)
  • source_path_format and target_path_format specify templates for the locations of pre-trained models/ adapters
  • adapters provides a mapping from pre-trained (source) models/ adapters to run ids

Finally, the path to this task map and the folder holding the run configurations have to be made available to the scripts:

export RUN_CONFIG_DIR="/path/to/run_configs"
export DEFAULT_TASK_MAP="/path/to/task_map.json"

Credits

Citation

If you find this repository helpful, please cite our paper "What to Pre-Train on? Efficient Intermediate Task Selection":

@inproceedings{poth-etal-2021-what-to-pre-train-on,
    title={What to Pre-Train on? Efficient Intermediate Task Selection},
    author={Clifton Poth and Jonas Pfeiffer and Andreas Rücklé and Iryna Gurevych},
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    month = nov,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2104.08247",
    pages = "to appear",
}
Owner
AdapterHub
AdapterHub
An easy to use, user-friendly and efficient code for extracting OpenAI CLIP (Global/Grid) features from image and text respectively.

Extracting OpenAI CLIP (Global/Grid) Features from Image and Text This repo aims at providing an easy to use and efficient code for extracting image &

Jianjie(JJ) Luo 13 Jan 06, 2023
StarGAN - Official PyTorch Implementation

StarGAN - Official PyTorch Implementation ***** New: StarGAN v2 is available at https://github.com/clovaai/stargan-v2 ***** This repository provides t

Yunjey Choi 5.1k Dec 30, 2022
AI-Broad-casting - AI Broad casting with python

Basic Code 1. Use The Code Configuration Environment conda create -n code_base p

Anomaly Detection 이상치 탐지 전처리 모듈

Anomaly Detection 시계열 데이터에 대한 이상치 탐지 1. Kernel Density Estimation을 활용한 이상치 탐지 train_data_path와 test_data_path에 존재하는 시점 정보를 포함하고 있는 csv 형태의 train data와

CLUST-consortium 43 Nov 28, 2022
DeepAmandine is an artificial intelligence that allows you to talk to it for hours, you won't know the difference.

DeepAmandine This is an artificial intelligence based on GPT-3 that you can chat with, it is very nice and makes a lot of jokes. We wish you a good ex

BuyWithCrypto 3 Apr 19, 2022
Huggingface Transformers + Adapters = ❤️

adapter-transformers A friendly fork of HuggingFace's Transformers, adding Adapters to PyTorch language models adapter-transformers is an extension of

AdapterHub 1.2k Jan 09, 2023
A list of NLP(Natural Language Processing) tutorials built on Tensorflow 2.0.

A list of NLP(Natural Language Processing) tutorials built on Tensorflow 2.0.

Won Joon Yoo 335 Jan 04, 2023
Pretrained Japanese BERT models

Pretrained Japanese BERT models This is a repository of pretrained Japanese BERT models. The models are available in Transformers by Hugging Face. Mod

Inui Laboratory 387 Dec 30, 2022
A repo for materials relating to the tutorial of CS-332 NLP

CS-332-NLP A repo for materials relating to the tutorial of CS-332 NLP Contents Tutorial 1: Introduction Corpus Regular expression Tokenization Tutori

Alok singh 9 Feb 15, 2022
This is a project of data parallel that running on NLP tasks.

This is a project of data parallel that running on NLP tasks.

2 Dec 12, 2021
Source code of paper "BP-Transformer: Modelling Long-Range Context via Binary Partitioning"

BP-Transformer This repo contains the code for our paper BP-Transformer: Modeling Long-Range Context via Binary Partition Zihao Ye, Qipeng Guo, Quan G

Zihao Ye 119 Nov 14, 2022
Original implementation of the pooling method introduced in "Speaker embeddings by modeling channel-wise correlations"

Speaker-Embeddings-Correlation-Pooling This is the original implementation of the pooling method introduced in "Speaker embeddings by modeling channel

Themos Stafylakis 10 Apr 30, 2022
Spert NLP Relation Extraction API deployed with torchserve for inference

URLMask Python program for Linux users to change a URL to ANY domain. A program than can take any url and mask it to any domain name you like. E.g. ne

Zichu Chen 1 Nov 24, 2021
KoBERT - Korean BERT pre-trained cased (KoBERT)

KoBERT KoBERT Korean BERT pre-trained cased (KoBERT) Why'?' Training Environment Requirements How to install How to use Using with PyTorch Using with

SK T-Brain 1k Jan 02, 2023
Code for the project carried out fulfilling the course requirements for Fall 2021 NLP at NYU

Introduction Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization,

Sai Himal Allu 1 Apr 25, 2022
Behavioral Testing of Clinical NLP Models

Behavioral Testing of Clinical NLP Models This repository contains code for testing the behavior of clinical prediction models based on patient letter

Betty van Aken 2 Sep 20, 2022
SimpleChinese2 集成了许多基本的中文NLP功能,使基于 Python 的中文文字处理和信息提取变得简单方便。

SimpleChinese2 SimpleChinese2 集成了许多基本的中文NLP功能,使基于 Python 的中文文字处理和信息提取变得简单方便。 声明 本项目是为方便个人工作所创建的,仅有部分代码原创。

Ming 30 Dec 02, 2022
Unsupervised Language Modeling at scale for robust sentiment classification

** DEPRECATED ** This repo has been deprecated. Please visit Megatron-LM for our up to date Large-scale unsupervised pretraining and finetuning code.

NVIDIA Corporation 1k Nov 17, 2022