Code for Findings at EMNLP 2021 paper: "Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning"

Related tags

Text Data & NLPCLIF
Overview

Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning

This repo is for Findings at EMNLP 2021 paper: Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning. Code clean-up is still in progress.

Data

Please extract the downloaded data and place it under PROJECT_DIR/datasets. Our training data stream and few-shot datasets are curated from https://github.com/iesl/leopard and https://github.com/INK-USC/CrossFit.

The directory structure is

PROJECT_DIR/datasets/crossfit_data/data/ + 55 classification tasks from the link above, e.g. PROJECT_DIR/datasets/crossfit_data/data/anli
PROJECT_DIR/datasets/leopard/ + 17 tasks from the link above, e.g. PROJECT_DIR/datasets/leopard/airline

Environment

Our code uses PyTorch 1.7.1. To allow fp16 training, you should also install apex.

Running Experiments

Training on CLIF-26

reg=0.01
lr=1e-4
seed=0
python run_model.py --tasks cola sst2 mrpc stsb qqp mnli qnli rte wnli \
--output_dir runs/glue_cfew_10k_choice_hnet_hardlong_sample_reg${reg}_s64_d256_limit/${lr}/${seed} \
--do_train --eval_period 100000 --eval_at_epoch_end  --wait_step 3 --num_train_epochs 100 --seed ${seed} \
--train_batch_size 64 --gradient_accumulation_steps 2 --learning_rate ${lr} --max_output_length 8 \
--generator_hdim 32 --example_limit 100 --train_limit 10000 --cl_method hnet --h_l2reg ${reg} \
--adapter_dim 256 --adapter_dim_final 64  --hard_long_term  --limit_label_vocab_space \
--sample_batch --scale_loss --stm_size 64

Few-shot evaluation on CLIF-26

python run_model.py --task_collection leopard --k_shot 16 --max_input_length 100  \
--output_dir /runs/glue_cfew_10k_choice_hnet_hardlong_sample_reg${reg}_s64_d256_limit/${lr}/${seed} \
--do_few_shot_predict --eval_period 100000 --eval_at_epoch_end  --wait_step 3 --num_train_epochs 100 \
--seed ${seed} --train_batch_size 64 --predict_batch_size 16 --few_shot_train_batch_size 16 \
--few_shot_wait_step 100000 --few_shot_num_train_epochs 800 --wait_step 3 --gradient_accumulation_steps 4 \
--scale_by_accumulation --learning_rate ${lr} --max_output_length 8  --generator_hdim 32 \
--example_limit 100 --train_limit 10000 --cl_method naive --h_l2reg ${reg} --adapter_dim 256 \
--adapter_dim_final 64 --hard_long_term --limit_label_vocab_space --no_short_term --long_term_task_emb_num 9 \
--postfix "naive_16shot"  --sample_batch --stm_size 64 --few_shot_eval_period 200

Training and evaluation on CLIF-55

reg=0.01
lr=1e-4
seed=0
python run_model.py  --task_collection crossfit_cls_train --crossfit_k_shot 16 --ssd --output_dir runs/crossfit_hnet_merge_space_${reg}/${lr}/${seed} --skip_intermediate_ckpt --add_space --merge_split --split_id ${seed} --seed ${seed} --do_train --eval_every_k_tasks 5 --eval_period 100 --skip_intermediate_ckpt --train_batch_size 64 --wait_step 3 --num_train_epochs 10000000  --learning_rate ${lr} --max_output_length 64 --example_limit 100 --train_limit 10000 --cl_method hnet --h_l2reg ${reg} --adapter_dim 256 --generator_hdim 32 --adapter_dim_final 64 --sample_batch --hard_long_term --stm_size 64
python run_model.py --task_collection crossfit_cls_train --crossfit_k_shot 16 --ssd --output_dir runs/crossfit_hnet_merge_space${reg}/${lr}/${seed} --skip_intermediate_ckpt --add_space --merge_split --split_id ${seed} --seed ${seed} --do_predict --eval_every_k_tasks 5 --eval_period 100 --skip_intermediate_ckpt --train_batch_size 64 --wait_step 3 --num_train_epochs 10000000  --learning_rate ${lr} --max_output_length 64 --example_limit 100 --train_limit 10000 --cl_method hnet --h_l2reg ${reg} --adapter_dim 256 --generator_hdim 32 --adapter_dim_final 64 --sample_batch --hard_long_term --stm_size 64
for split_id in 0 1 2 3 4
do
  python run_model.py --task_collection crossfit_cls_test --crossfit_k_shot 16 --ssd --postfix "split${split_id}"  --long_term_task_emb_num 45 --do_few_shot_predict --few_shot_eval_period 200 --few_shot_num_train_epochs 800 --few_shot_train_batch_size 64 --few_shot_wait_step 100 --mtl_task_num 45 --output_dir runs/crossfit_hnet_merge_space_${reg}/${lr}/${seed} --add_space  --limit_label_vocab_space --split_id ${split_id} --seed ${seed} --eval_period 100 --train_batch_size 64 --gradient_accumulation_steps 1 --wait_step 6 --num_train_epochs 10000  --learning_rate ${lr} --max_output_length 64 --example_limit 100 --train_limit 10000 --cl_method naive --adapter_dim 256 --generator_hdim 32 --adapter_dim_final 64 --sample_batch --hard_long_term
done

Here are mapping between command line arguments and implemented methods.

  • BART-Single without adapter: --cl_method naive --no_param_gen --skip_adapter --train_all
  • BART-Single-MTL: --cl_method naive --no_param_gen --skip_mtl --mtl --train_all
  • BiHNET-Vanilla: --cl_method naive --hard_long_term
  • BiHNET with trained task embeddings: --cl_method hnet --no_short_term --train_task_embs --hard_long_term
  • BART-Adapter-Single: --cl_method naive --no_param_gen --lr 3e-4
Owner
INK Lab @ USC
Intelligence and Knowledge Discovery (INK) Research Lab at University of Southern California
INK Lab @ USC
Spokestack is a library that allows a user to easily incorporate a voice interface into any Python application with a focus on embedded systems.

Welcome to Spokestack Python! This library is intended for developing voice interfaces in Python. This can include anything from Raspberry Pi applicat

Spokestack 133 Sep 20, 2022
Trained T5 and T5-large model for creating keywords from text

text to keywords Trained T5-base and T5-large model for creating keywords from text. Supported languages: ru Pretraining Large version | Pretraining B

Danil 61 Nov 24, 2022
Code for Emergent Translation in Multi-Agent Communication

Emergent Translation in Multi-Agent Communication PyTorch implementation of the models described in the paper Emergent Translation in Multi-Agent Comm

Facebook Research 75 Jul 15, 2022
AMUSE - financial summarization

AMUSE AMUSE - financial summarization Unzip data.zip Train new model: python FinAnalyze.py --task train --start 0 --count how many files,-1 for all

1 Jan 11, 2022
Python code for ICLR 2022 spotlight paper EViT: Expediting Vision Transformers via Token Reorganizations

Expediting Vision Transformers via Token Reorganizations This repository contain

Youwei Liang 101 Dec 26, 2022
Code for the Findings of NAACL 2022(Long Paper): AdapterBias: Parameter-efficient Token-dependent Representation Shift for Adapters in NLP Tasks

AdapterBias: Parameter-efficient Token-dependent Representation Shift for Adapters in NLP Tasks arXiv link: upcoming To be published in Findings of NA

Allen 16 Nov 12, 2022
🤕 spelling exceptions builder for lazy people

🤕 spelling exceptions builder for lazy people

Vlad Bokov 3 May 12, 2022
Implementation of legal QA system based on SentenceKoBART

LegalQA using SentenceKoBART Implementation of legal QA system based on SentenceKoBART How to train SentenceKoBART Based on Neural Search Engine Jina

Heewon Jeon(gogamza) 75 Dec 27, 2022
The ibet-Prime security token management system for ibet network.

ibet-Prime The ibet-Prime security token management system for ibet network. Features ibet-Prime is an API service that enables the issuance and manag

BOOSTRY 8 Dec 22, 2022
Fine-tuning scripts for evaluating transformer-based models on KLEJ benchmark.

The KLEJ Benchmark Baselines The KLEJ benchmark (Kompleksowa Lista Ewaluacji Językowych) is a set of nine evaluation tasks for the Polish language und

Allegro Tech 17 Oct 18, 2022
NeoDays-based tileset for the roguelike CDDA (Cataclysm Dark Days Ahead)

NeoDaysPlus Reduced contrast, expanded, and continuously developed version of the CDDA tileset NeoDays that's being completed with new sprites for mis

0 Nov 12, 2022
Basic Utilities for PyTorch Natural Language Processing (NLP)

Basic Utilities for PyTorch Natural Language Processing (NLP) PyTorch-NLP, or torchnlp for short, is a library of basic utilities for PyTorch NLP. tor

Michael Petrochuk 2.1k Jan 01, 2023
Free and Open Source Machine Translation API. 100% self-hosted, offline capable and easy to setup.

LibreTranslate Try it online! | API Docs | Community Forum Free and Open Source Machine Translation API, entirely self-hosted. Unlike other APIs, it d

3.4k Dec 27, 2022
A python package for deep multilingual punctuation prediction.

This python library predicts the punctuation of English, Italian, French and German texts. We developed it to restore the punctuation of transcribed spoken language.

Oliver Guhr 27 Dec 22, 2022
Summarization module based on KoBART

KoBART-summarization Install KoBART pip install git+https://github.com/SKT-AI/KoBART#egg=kobart Requirements pytorch==1.7.0 transformers==4.0.0 pytor

seujung hwan, Jung 148 Dec 28, 2022
Russian GPT3 models.

Russian GPT-3 models (ruGPT3XL, ruGPT3Large, ruGPT3Medium, ruGPT3Small) trained with 2048 sequence length with sparse and dense attention blocks. We also provide Russian GPT-2 large model (ruGPT2Larg

Sberbank AI 1.6k Jan 05, 2023
FedNLP: A Benchmarking Framework for Federated Learning in Natural Language Processing

FedNLP is a research-oriented benchmarking framework for advancing federated learning (FL) in natural language processing (NLP). It uses FedML repository as the git submodule. In other words, FedNLP

FedML-AI 216 Nov 27, 2022
Implementaion of our ACL 2022 paper Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation

Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation This is the implementaion of our paper: Bridging the

hezw.tkcw 20 Dec 12, 2022
Framework for fine-tuning pretrained transformers for Named-Entity Recognition (NER) tasks

NERDA Not only is NERDA a mesmerizing muppet-like character. NERDA is also a python package, that offers a slick easy-to-use interface for fine-tuning

Ekstra Bladet 141 Dec 30, 2022
Crowd sourced training data for Rasa NLU models

NLU Training Data Crowd-sourced training data for the development and testing of Rasa NLU models. If you're interested in grabbing some data feel free

Rasa 169 Dec 26, 2022