EMNLP 2021 Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections

Last update: Nov 03, 2022

Related tags

Deep Learning Meta-tuning

Overview

Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections

Ruiqi Zhong, Kristy Lee^*, Zheng Zhang^*, Dan Klein

EMNLP 2021 Findings, https://arxiv.org/abs/2104.04670

Data

Please download the dataset from here: https://drive.google.com/file/d/1hrLlpk6Pla95Bnv_e1MAhCx7uJSDgA-w/view?usp=sharing

If you are using this dataset, please cite all the papers in the custom_citations.txt, anthology_citations.txt, urls.txt file in the citations folder. Thanks!

Each datapoint is represented as a dictionary.

{"q": [label description], "c": [text input], "a": [0 or 1]},

where "q" stands for question, which contains label information, "c" stands for context, which contains the input text, "a" stands for answer, which is either 1 (Yes) or 0 (No).

training_dicts/ contains all the datasets for training, and each of the .pkl file is a list of datapoints. testing_dicts/ contains all the datasets for evaluation, and each of the .pkl file is a map from (label, label descriptions) to a list of datapoints.

Datasets that have the same group number in front of their filenames are considered similar. Notice that, for each dataset, there might be overlapping datapoints between the training and testing split, but it is okay since we never train and test on the same dataset.

Additionally, to speedup evaluation, we performed subsampling for many of the test datasets, so the numbers will not be directly comparable to those in the other paper.

Specialized Models are Better

Meta-tune a model that is initialized with T5-large and test it on unseen (non-similar) datasets

python3 default_train.py large

Test UnifiedQA on all datatsets used for evaluation

python3 baseline.py large

Evaluate and compare the meta-tuned model and the UnifiedQA baseline with AUC-ROC for each label description.

python3 evaluate_and_plot.py large

We should expect to see that meta-tuned model is better than the UnifiedQA model on the majority of label descriptions.

Larger Models are Better

We can train another smaller-sized model using the command

python3 default_train.py base

and then we can compare the large vs. base model modifying evaluate_and_plot.py.

EMNLP 2021 Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections

Related tags

Overview

Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections

Data

Specialized Models are Better

Larger Models are Better

Owner

Ruiqi Zhong

Research on Tabular Deep Learning (Python package & papers)

Compare neural networks by their feature similarity

The official implementation of NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation [ICLR-2021]. https://arxiv.org/pdf/2101.12378.pdf

Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"

Extracts essential Mediapipe face landmarks and arranges them in a sequenced order.

Perfect implement. Model shared. x0.5 (Top1:60.646) and 1.0x (Top1:69.402).

Python implementation of O-OFDMNet, a deep learning-based optical OFDM system,

A python toolbox for predictive uncertainty quantification, calibration, metrics, and visualization

Application of K-means algorithm on a music dataset after a dimensionality reduction with PCA

Bayesian Inference Tools in Python

[ICCV 2021] Official Pytorch implementation for Discriminative Region-based Multi-Label Zero-Shot Learning SOTA results on NUS-WIDE and OpenImages

这是一个facenet-pytorch的库，可以用于训练自己的人脸识别模型。

This project deals with the detection of skin lesions within the ISICs dataset using YOLOv3 Object Detection with Darknet.

Generate text captions for images from their CLIP embeddings. Includes PyTorch model code and example training script.

A tool for calculating distortion parameters in coordination complexes.

Several simple examples for popular neural network toolkits calling custom CUDA operators.

Official repository of IMPROVING DEEP IMAGE MATTING VIA LOCAL SMOOTHNESS ASSUMPTION.

Narya API allows you track soccer player from camera inputs, and evaluate them with an Expected Discounted Goal (EDG) Agent

Time should be taken seer-iously

AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition