Research code for "What to Pre-Train on? Efficient Intermediate Task Selection", EMNLP 2021

Last update: Dec 24, 2022

efficient-task-transfer

This repository contains code for the experiments in our paper "What to Pre-Train on? Efficient Intermediate Task Selection". Most importantly, this includes scripts for easy training of Transformers and Adapters across a wide range of NLU tasks.

Overview

The repository is structured as follows:

itrain holds the itrain package which allows easy setup, training and evaluation of Transformers and Adapters
run_configs provides default training configurations for all tasks currently supported by itrain
training_scripts provides scripts for sequential adapter fine-tuning and adapter fusion as used in the paper
task_selection provides scripts used for intermediate task selection in the paper

Setup & Requirements

The code in this repository was developed using Python v3.6.8, PyTorch v1.7.1 and adapter-transformers v1.1.1, which is based on HuggingFace Transformers v3.5.1. Using versions different from the ones specified might not work.

After setting up Python and PyTorch (ideally in a virtual environment), all additional requirements together with the itrain package can be installed using:

pip install -e .

Additional setup steps required for running some scripts are detailed in the respective locations below.

Transformer & Adapter Training

The itrain package provides a simple interface for configuring Transformer and Adapter training runs. itrain provides tools for:

downloading and preprocessing datasets via HuggingFace datasets
setting up Transformer and Adapter training
training and evaluating on different tasks
notifying on training start and results via mail or Telegram

itrain can be invoked from the command line by passing a run configuration file in JSON format. Example configurations for all currently supported tasks can be found in the run_configs folder. All supported configuration keys are defined in arguments.py.

Running a setup from the command line can look like this:

itrain --id 42 run_configs/sst2.json

This will train an adapter on the SST-2 task using roberta-base as the base model (as specified in the config file).

Besides modifying configuration keys directly in the JSON file, they can be overridden using command line parameters. For example, we can modify the previous training run to fully fine-tune a bert-base-uncased model:

itrain --id <run_id> \
    --model_name_or_path bert-base-uncased \
    --train_adapter false \
    --learning_rate 3e-5 \
    --num_train_epochs 3 \
    --patience 0 \
    run_configs/<task>.json
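When several runs are launched from a script, the same overrides can be assembled programmatically. The following is a minimal sketch, not part of itrain: `build_itrain_command` is a hypothetical helper, and it assumes only that every configuration key can be passed as `--key value`, as in the command above.

```python
import shlex


def build_itrain_command(run_id, config_path, overrides=None):
    """Assemble an itrain command line as a list of arguments.

    Hypothetical helper (not part of itrain). It assumes each override is
    passed as `--key value`, mirroring the shell example above.
    """
    command = ["itrain", "--id", str(run_id)]
    for key, value in (overrides or {}).items():
        # Booleans are written in lowercase, as in `--train_adapter false`.
        if isinstance(value, bool):
            value = str(value).lower()
        command += ["--" + key, str(value)]
    command.append(config_path)
    return command


command = build_itrain_command(
    42,
    "run_configs/sst2.json",
    {
        "model_name_or_path": "bert-base-uncased",
        "train_adapter": False,
        "learning_rate": "3e-5",  # kept as a string to preserve the notation
        "num_train_epochs": 3,
        "patience": 0,
    },
)
# Print a copy-pastable shell command; pass `command` to subprocess.run to execute it.
print(" ".join(shlex.quote(part) for part in command))
```

Returning a list rather than a single string avoids quoting problems if the result is later handed to `subprocess.run`.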

Alternatively, training setups can be configured directly in Python by using the Setup class of itrain. An example for this is given in example.py.

Intermediate Task Transfer & Task Selection Experiments

Some scripts that helped run the experiments presented in "What to Pre-Train on? Efficient Intermediate Task Selection" are provided:

See training_scripts for details on intermediate task transfer using sequential fine-tuning or adapter fusion.
See task_selection for details on intermediate task selection methods.

All these scripts rely on pre-trained models/adapters as described above and on the following additional setup.

Setup

We used a configuration file to specify the pre-trained models/adapters and tasks to be used as transfer sources and transfer targets for different task transfer strategies and task selection methods. The full configuration as used in the paper is given in task_map.json. It has to be modified to use self-trained models/adapters:

from and to specify which tasks are used as transfer sources and transfer targets (names as defined in run_configs)
source_path_format and target_path_format specify templates for the locations of pre-trained models/adapters
adapters provides a mapping from pre-trained (source) models/adapters to run ids
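Before running the scripts with a modified task map, it can help to check that none of the keys listed above was dropped by accident. This is a minimal sketch, not part of the repository: the helper names are hypothetical, and treating the five keys as top-level entries of task_map.json is an assumption that should be checked against the file shipped with the code.

```python
import json

# Keys named in the description above. Treating them as top-level keys of
# task_map.json is an assumption of this sketch.
EXPECTED_KEYS = ("from", "to", "source_path_format", "target_path_format", "adapters")


def missing_task_map_keys(task_map):
    """Return the expected keys that are absent from a loaded task map."""
    return [key for key in EXPECTED_KEYS if key not in task_map]


def load_task_map(path):
    """Load a task map from JSON and fail early if expected keys are missing."""
    with open(path, encoding="utf-8") as f:
        task_map = json.load(f)
    missing = missing_task_map_keys(task_map)
    if missing:
        raise KeyError("task map is missing keys: " + ", ".join(missing))
    return task_map
```

The sketch checks only that the keys are present; it makes no assumption about the format of the path templates or the run id mapping.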

Finally, the path to this task map and the folder holding the run configurations have to be made available to the scripts:

export RUN_CONFIG_DIR="/path/to/run_configs"
export DEFAULT_TASK_MAP="/path/to/task_map.json"
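A script that depends on these two variables can verify that they are set before doing any work. The following is a minimal sketch under that assumption: `resolve_itrain_paths` is a hypothetical helper and not a function provided by this repository.

```python
import os


def resolve_itrain_paths(environ=None):
    """Read the two variables exported above and return them as a tuple.

    Hypothetical helper (not part of the repository). It only checks that
    both variables are set and non-empty.
    """
    environ = os.environ if environ is None else environ
    names = ("RUN_CONFIG_DIR", "DEFAULT_TASK_MAP")
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise EnvironmentError("please export: " + ", ".join(missing))
    return environ["RUN_CONFIG_DIR"], environ["DEFAULT_TASK_MAP"]


# Example with an explicit mapping instead of the real environment.
run_config_dir, task_map_path = resolve_itrain_paths(
    {"RUN_CONFIG_DIR": "/path/to/run_configs", "DEFAULT_TASK_MAP": "/path/to/task_map.json"}
)
```

Calling `resolve_itrain_paths()` without arguments reads from the actual process environment.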

Credits

huggingface/transformers for the Transformers implementations and the trainer class
huggingface/datasets for dataset downloading and preprocessing
tuvuumass/task-transferability for the TextEmb and TaskEmb implementations
Adapter-Hub/adapter-transformers for the adapter implementation

Citation

If you find this repository helpful, please cite our paper "What to Pre-Train on? Efficient Intermediate Task Selection":

@inproceedings{poth-etal-2021-what-to-pre-train-on,
    title={What to Pre-Train on? Efficient Intermediate Task Selection},
    author={Clifton Poth and Jonas Pfeiffer and Andreas Rücklé and Iryna Gurevych},
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    month = nov,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/2104.08247",
    pages = "to appear",
}
