Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners

Overview

DART

Implementation of the ICLR 2022 paper Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners.

Environment

  • [email protected]
  • Use pip install -r requirements.txt to install dependencies.
  • A wandb account is required if you want to search for the best hyper-parameter combinations.

Data source

  • 16-shot GLUE dataset from LM-BFF.
  • The generated data consists of 5 random splits (seeds 13/21/42/87/100) per task, each with 16 samples.
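
The split identifiers above can be read as seeds for a reproducible sampler. The following is a minimal sketch of that idea, not the LM-BFF generation script: `sample_split` and the toy `pool` are hypothetical names, and drawing the 16 examples from the whole pool (rather than per label) is an assumption made only to match the description above.

```python
import random

SPLIT_SEEDS = [13, 21, 42, 87, 100]  # split identifiers listed above
K = 16  # samples per split, as stated above


def sample_split(pool, seed, k=K):
    """Draw k examples reproducibly. Hypothetical helper, not part of this repo."""
    rng = random.Random(seed)  # a private RNG keeps each split independent
    return rng.sample(pool, k)


# Toy pool standing in for a task's full training set (assumption).
pool = [f"example-{i}" for i in range(1000)]
splits = {seed: sample_split(pool, seed) for seed in SPLIT_SEEDS}
```

Because each split uses its own seeded generator, calling `sample_split(pool, 13)` again returns the same 16 examples, which is what makes per-split results comparable across runs.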

How to run

  • To run across all 5 splits of a task, use run.py:
    • In the arguments, encoder="inner" is the method proposed in the paper, where verbalizers are other trainable tokens; encoder="manual" means the verbalizers are fixed, manually selected tokens; encoder="lstm" refers to the P-Tuning method.
$ python run.py -h
usage: run.py [-h] [--encoder {manual,lstm,inner,inner2}] [--task TASK]
              [--num_splits NUM_SPLITS] [--repeat REPEAT] [--load_manual]
              [--extra_mask_rate EXTRA_MASK_RATE]
              [--output_dir_suffix OUTPUT_DIR_SUFFIX]

optional arguments:
  -h, --help            show this help message and exit
  --encoder {manual,lstm,inner,inner2}
  --task TASK
  --num_splits NUM_SPLITS
  --repeat REPEAT
  --load_manual
  --extra_mask_rate EXTRA_MASK_RATE
  --output_dir_suffix OUTPUT_DIR_SUFFIX, -o OUTPUT_DIR_SUFFIX
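
As an illustration of the options above, the sketch below builds run.py command lines for the three encoders described earlier so they can be compared on one task. It uses only flags that appear in the help text; `run_command` is a hypothetical helper, and the task name "SST-2" is an assumption borrowed from the sweep.py task list, since run.py's help does not enumerate valid tasks.

```python
import shlex


def run_command(encoder, task, num_splits=5, repeat=None):
    """Build an argv list for run.py from flags shown in its help text."""
    argv = ["python", "run.py", "--encoder", encoder, "--task", task,
            "--num_splits", str(num_splits)]
    if repeat is not None:  # only pass --repeat when explicitly requested
        argv += ["--repeat", str(repeat)]
    return argv


# One command per encoder discussed above; "SST-2" is an assumed task name.
commands = [run_command(enc, "SST-2") for enc in ("manual", "lstm", "inner")]
for argv in commands:
    print(shlex.join(argv))  # shell-quoted form, ready to copy into a terminal
```

Each list can be handed to `subprocess.run(argv, check=True)` from the repository root to launch the run; the sketch only prints the commands.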
  • To train and evaluate on a single split with details recorded, use inference.py.
    • Before running, [task_name, label_list, prompt_type] should be configured in the code.
    • prompt_type="none" refers to fixed verbalizer training, while "inner" refers to the method proposed in the paper ("inner2" is the deprecated 2-stage training).
  • To find the optimal hyper-parameters for each task-split and reproduce our results, use sweep.py:
    • Refer to the WandB documentation for more details.
$ python sweep.py -h
usage: sweep.py [-h]
                [--task {SST-2,sst-5,mr,cr,mpqa,subj,trec,CoLA,MNLI,MNLI-mm,SNLI,QNLI,RTE-glue,MRPC,QQP}]
                [--encoder {none,mlp,lstm,inner,inner2}]
                [--seed_split {13,21,42,87,100} [{13,21,42,87,100} ...]]
                [--batch_size {4,8,16,24,32} [{4,8,16,24,32} ...]]
                [--sweep_id SWEEP_ID]

optional arguments:
  -h, --help            show this help message and exit
  --task {SST-2,sst-5,mr,cr,mpqa,subj,trec,CoLA,MNLI,MNLI-mm,SNLI,QNLI,RTE-glue,MRPC,QQP}
  --encoder {none,mlp,lstm,inner,inner2}
  --seed_split {13,21,42,87,100} [{13,21,42,87,100} ...]
  --batch_size {4,8,16,24,32} [{4,8,16,24,32} ...]
  --sweep_id SWEEP_ID
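
The usage line above shows that --seed_split and --batch_size each accept one or more values from a fixed set. The sketch below reconstructs a parser consistent with that help text to show how such arguments parse. It is not the code from sweep.py: the `int` types and the absence of defaults are assumptions.

```python
import argparse

TASKS = ["SST-2", "sst-5", "mr", "cr", "mpqa", "subj", "trec", "CoLA",
         "MNLI", "MNLI-mm", "SNLI", "QNLI", "RTE-glue", "MRPC", "QQP"]


def build_parser():
    """Parser mirroring the sweep.py help text; types are assumed."""
    parser = argparse.ArgumentParser(prog="sweep.py")
    parser.add_argument("--task", choices=TASKS)
    parser.add_argument("--encoder",
                        choices=["none", "mlp", "lstm", "inner", "inner2"])
    # nargs="+" collects one or more values into a list; each is checked
    # against `choices` after conversion to int.
    parser.add_argument("--seed_split", type=int, nargs="+",
                        choices=[13, 21, 42, 87, 100])
    parser.add_argument("--batch_size", type=int, nargs="+",
                        choices=[4, 8, 16, 24, 32])
    parser.add_argument("--sweep_id")
    return parser


args = build_parser().parse_args(
    ["--task", "SST-2", "--encoder", "inner",
     "--seed_split", "13", "42", "--batch_size", "8", "16"]
)
```

Here `args.seed_split` and `args.batch_size` come back as lists, so a sweep can be restricted to a subset of splits and batch sizes in a single invocation.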
  • To train and evaluate with more customized configurations, use cli.py.
  • To analyze and visualize the results produced by inference.py, use visualize.py and visualize_word_emb.py.
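
For the inference.py step above, the three settings that must be configured in the code could be checked with something like the following. This is a hypothetical sketch, not the variables as they appear in inference.py: `make_config` is an invented helper, and the SST-2 values (including the label strings "0" and "1") are assumptions chosen for illustration. Only the prompt_type values come from the text above.

```python
ALLOWED_PROMPT_TYPES = {"none", "inner", "inner2"}  # values named above


def make_config(task_name, label_list, prompt_type):
    """Bundle and sanity-check the three settings. Hypothetical helper."""
    if prompt_type not in ALLOWED_PROMPT_TYPES:
        raise ValueError(f"unknown prompt_type: {prompt_type!r}")
    if len(label_list) < 2 or len(set(label_list)) != len(label_list):
        raise ValueError("label_list needs at least two distinct labels")
    return {
        "task_name": task_name,
        "label_list": list(label_list),
        "prompt_type": prompt_type,
    }


# Assumed example values for a binary sentiment task.
config = make_config("SST-2", ["0", "1"], "inner")
```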

How to Cite

@article{DBLP:journals/corr/abs-2108-13161,
  author    = {Ningyu Zhang and
               Luoqiu Li and
               Xiang Chen and
               Shumin Deng and
               Zhen Bi and
               Chuanqi Tan and
               Fei Huang and
               Huajun Chen},
  title     = {Differentiable Prompt Makes Pre-trained Language Models Better Few-shot
               Learners},
  journal   = {CoRR},
  volume    = {abs/2108.13161},
  year      = {2021},
  url       = {https://arxiv.org/abs/2108.13161},
  eprinttype = {arXiv},
  eprint    = {2108.13161},
  timestamp = {Thu, 13 Jan 2022 17:33:17 +0100},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2108-13161.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
Owner
ZJUNLP
NLP Group of Knowledge Engine Lab at Zhejiang University