NAACL'2021: Factual Probing Is [MASK]: Learning vs. Learning to Recall

Overview

OptiPrompt

This is the PyTorch implementation of the paper Factual Probing Is [MASK]: Learning vs. Learning to Recall.

We propose OptiPrompt, a simple and effective approach for factual probing. OptiPrompt optimizes prompts directly in the input embedding space. It outperforms previous prompting methods on the LAMA benchmark. Furthermore, to better interpret probing results, we propose control experiments based on probing randomly initialized models. Please check our paper for details.
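
For intuition, here is a minimal PyTorch sketch of the core idea (not the repository's actual implementation; the fact and hyperparameters are illustrative): the pre-trained LM stays frozen, and a few dense prompt vectors are optimized directly in the input embedding space to maximize the probability of the correct object at the [MASK] position.

import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertForMaskedLM.from_pretrained("bert-base-cased")
for p in model.parameters():
    p.requires_grad = False  # the LM is frozen; only the prompt vectors train

num_vectors = 5  # mirrors the --num_vectors argument
prompt = torch.nn.Parameter(
    0.02 * torch.randn(num_vectors, model.config.hidden_size))

def mask_logprob(subject, answer_id):
    # Build "[CLS] subject [V_1]...[V_k] [MASK] [SEP]" in embedding space.
    ids = tokenizer(subject, return_tensors="pt")["input_ids"][0]
    emb = model.get_input_embeddings()
    mask = torch.tensor([tokenizer.mask_token_id])
    inputs = torch.cat([emb(ids[:1]),    # [CLS]
                        emb(ids[1:-1]),  # subject tokens
                        prompt,          # learned dense vectors
                        emb(mask),       # [MASK]
                        emb(ids[-1:])])  # [SEP]
    logits = model(inputs_embeds=inputs.unsqueeze(0)).logits
    mask_pos = 1 + (ids.size(0) - 2) + num_vectors
    return torch.log_softmax(logits[0, mask_pos], dim=-1)[answer_id]

optimizer = torch.optim.Adam([prompt], lr=3e-3)
loss = -mask_logprob("Paris", tokenizer.convert_tokens_to_ids("France"))
loss.backward()  # gradients flow only into the prompt vectors
optimizer.step()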

Quick links

  • Overview
  • Setup
  • Run OptiPrompt
  • Run Fine-tuning
  • Evaluate LAMA/LPAQA/AutoPrompt prompts
  • Questions?
  • Citation

Setup

Install dependencies

Our code is based on Python 3.7. All experiments are run on a single GPU.

Please install all the dependency packages using the following command:

pip install -r requirements.txt

Download the data

We pack all the datasets used in our experiments here. Please download and extract the files to ./data, or run the following command to automatically download and extract them.

bash scripts/download_data.sh

The datasets are structured as below.

data
├── LAMA-TREx                         # The original LAMA-TREx test set (34,039 examples)
│   ├── P17.jsonl                     # Testing file for the relation `P17`
│   └── ...
├── LAMA-TREx_UHN                     # The LAMA-TREx_UHN test set (27,102 examples)
│   ├── P17.jsonl                     # Testing file for the relation `P17`
│   └── ...
├── LAMA-TREx-easy-hard               # The easy and hard partitions of the LAMA-TREx dataset (check the paper for details)
│   ├── Easy                          # The LAMA-easy partition (10,546 examples)
│   │   ├── P17.jsonl                 # Testing file for the relation `P17`
│   │   └── ...
│   └── Hard                          # The LAMA-hard partition (23,493 examples)
│       ├── P17.jsonl                 # Testing file for the relation `P17`
│       └── ...
├── autoprompt_data                   # Training data collected by AutoPrompt
│   ├── P17                           # Train/dev/test files for the relation `P17`
│   │   ├── train.jsonl               # Training examples
│   │   ├── dev.jsonl                 # Development examples
│   │   └── test.jsonl                # Test examples (the same as LAMA-TREx test set)
│   └── ...
└── cmp_lms_data                      # Training data collected by ourselves which can be used for BERT, RoBERTa, and ALBERT (we only use this dataset in Table 6 in the paper)
    ├── P17                           # Train/dev/test files for the relation `P17`
    │   ├── train.jsonl               # Training examples
    │   ├── dev.jsonl                 # Development examples
    │   └── test.jsonl               # Test examples (a subset of the LAMA-TREx test set, filtered using the common vocab of three models)
    └── ...
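
Each relation file is in JSON Lines format, with one fact per line. A minimal sketch of how to read it (the field names sub_label/obj_label follow the LAMA data release and should be treated as assumptions; the actual files may carry additional metadata such as entity URIs):

import json

# Hedged sketch: peek at one fact from a relation file.
with open("data/LAMA-TREx/P17.jsonl") as f:
    fact = json.loads(f.readline())

# For P17 ("country"), a line pairs a subject entity with its country,
# e.g. fact["sub_label"] -> "Eiffel Tower", fact["obj_label"] -> "France"
print(fact["sub_label"], "->", fact["obj_label"])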

Run OptiPrompt

Train/evaluate OptiPrompt

You can use code/run_optiprompt.py to train or evaluate prompts for a specific relation. A command template is as follows:

rel=P101
dir=outputs/${rel}
mkdir -p ${dir}

python code/run_optiprompt.py \
    --relation_profile relation_metainfo/LAMA_relations.jsonl \
    --relation ${rel} \
    --common_vocab_filename common_vocabs/common_vocab_cased.txt \
    --model_name bert-base-cased \
    --do_train \
    --train_data data/autoprompt_data/${rel}/train.jsonl \
    --dev_data data/autoprompt_data/${rel}/dev.jsonl \
    --do_eval \
    --test_data data/LAMA-TREx/${rel}.jsonl \
    --output_dir ${dir} \
    --random_init none \
    --output_predictions \
    [--init_manual_template] [--num_vectors 5 | 10]

Arguments:

  • relation_profile: the meta-information for each relation, including the manual templates.
  • relation: the relation type (e.g., P101) considered in this experiment.
  • common_vocab_filename: the vocabulary used to filter out facts; it should be the intersection of different models' vocabularies for a fair comparison.
  • model_name: the pre-trained model used in this experiment, e.g., bert-base-cased, albert-xxlarge-v1.
  • do_train: whether to train the prompts on the training and development sets.
  • do_eval: whether to test the trained prompts on the test set.
  • {train|dev|test}_data: the file path of the training/development/test set.
  • random_init: how to randomly initialize the model before training; there are three settings (see the sketch after this list):
    • none: use the pre-trained model as-is, with no random initialization;
    • embedding: the Rand E control setting, where we randomly initialize the embedding layer of the model;
    • all: the Rand M control setting, where we randomly initialize all the parameters of the model.
  • init_manual_template: whether to initialize the dense vectors in OptiPrompt using the manual prompts.
  • num_vectors: the number of dense vectors added in OptiPrompt (this argument is only used when init_manual_template is not set).
  • output_predictions: whether to output the top-k predictions for each test fact (k is specified by --k).
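
To make the two control settings concrete, here is a hedged sketch of what they correspond to in Hugging Face Transformers (illustrative, not the repository's code):

from transformers import BertConfig, BertForMaskedLM

# Rand E (--random_init embedding): keep the pre-trained weights, but
# re-initialize the input embedding layer.
model = BertForMaskedLM.from_pretrained("bert-base-cased")
model.get_input_embeddings().weight.data.normal_(
    mean=0.0, std=model.config.initializer_range)

# Rand M (--random_init all): same architecture, all parameters random.
config = BertConfig.from_pretrained("bert-base-cased")
rand_m = BertForMaskedLM(config)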

Run experiments on all relations

We provide an example script (scripts/run_optiprompt.sh) that runs OptiPrompt on all 41 relations in the LAMA benchmark. Run the following command to use it:

bash scripts/run_optiprompt.sh

By default, this script runs OptiPrompt initialized with manual prompts on the pre-trained bert-base-cased model (no random initialization). The results will be stored in the outputs directory.

Please modify the shell variables (i.e., OUTPUTS_DIR, MODEL, RAND) in scripts/run_optiprompt.sh if you want to run experiments with other settings (e.g., RAND=embedding for the Rand E control).

Run Fine-tuning

We release the fine-tuning code that we used in our experiments (see Section 4 of the paper).

Fine-tuning language models on factual probing

You can use code/run_finetune.py to fine-tune a language model on a specific relation. A command template is as follows (a minimal sketch of the training objective appears after the argument list):

rel=P101
dir=outputs/${rel}
mkdir -p ${dir}

python code/run_finetune.py \
    --relation_profile relation_metainfo/LAMA_relations.jsonl \
    --relation ${rel} \
    --common_vocab_filename common_vocabs/common_vocab_cased.txt \
    --model_name bert-base-cased \
    --do_train \
    --train_data data/autoprompt_data/${rel}/train.jsonl \
    --dev_data data/autoprompt_data/${rel}/dev.jsonl \
    --do_eval \
    --test_data data/LAMA-TREx/${rel}.jsonl \
    --output_dir ${dir} \
    --random_init none \
    --output_predictions

Arguments:

  • relation_profile: the meta-information for each relation, including the manual templates.
  • relation: the relation type (e.g., P101) considered in this experiment.
  • common_vocab_filename: the vocabulary used to filter out facts; it should be the intersection of different models' vocabularies for a fair comparison.
  • model_name: the pre-trained model used in this experiment, e.g., bert-base-cased, albert-xxlarge-v1.
  • do_train: whether to fine-tune the model on the training and development sets.
  • do_eval: whether to test the fine-tuned model on the test set.
  • {train|dev|test}_data: the file path of the training/development/test set.
  • random_init: how to randomly initialize the model before training; there are three settings:
    • none: use the pre-trained model as-is, with no random initialization;
    • embedding: the Rand E control setting, where we randomly initialize the embedding layer of the model;
    • all: the Rand M control setting, where we randomly initialize all the parameters of the model.
  • output_predictions: whether to output the top-k predictions for each test fact (k is specified by --k).
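
As promised above, a minimal sketch of the fine-tuning objective (hedged; the template, fact, and hyperparameters are illustrative, not the repository's exact code): all model parameters are updated so that the model predicts the object of each training fact at the [MASK] position.

import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertForMaskedLM.from_pretrained("bert-base-cased")
optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)

# One hypothetical training fact for P101 ("field of work").
text = "Albert Einstein works in the field of [MASK] ."
inputs = tokenizer(text, return_tensors="pt")
labels = torch.full_like(inputs["input_ids"], -100)  # ignore non-mask tokens
mask_pos = inputs["input_ids"] == tokenizer.mask_token_id
labels[mask_pos] = tokenizer.convert_tokens_to_ids("physics")

loss = model(**inputs, labels=labels).loss  # cross-entropy at [MASK]
loss.backward()  # unlike OptiPrompt, gradients update the whole model
optimizer.step()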

Run experiments on all relations

We provide an example script (scripts/run_finetune.sh) that runs fine-tuning on all 41 relations in the LAMA benchmark. Run the following command to use it:

bash scripts/run_finetune.sh

Please modify the shell variables (i.e., OUTPUTS_DIR, MODEL, RAND) in scripts/run_finetune.sh if you want to run experiments with other settings.

Evaluate LAMA/LPAQA/AutoPrompt prompts

We provide a script to evaluate prompts released in previous works (based on code/run_finetune.py with only --do_eval). Please use the following command:

bash scripts/run_eval_prompts.sh {lama | lpaqa | autoprompt}

Questions?

If you have any questions related to the code or the paper, feel free to email Zexuan Zhong ([email protected]) or Dan Friedman ([email protected]). If you encounter any problems when using the code, or want to report a bug, you can open an issue. Please describe the problem in detail so that we can help you better and more quickly!

Citation

@inproceedings{zhong2021factual,
   title={Factual Probing Is [MASK]: Learning vs. Learning to Recall},
   author={Zhong, Zexuan and Friedman, Dan and Chen, Danqi},
   booktitle={North American Association for Computational Linguistics (NAACL)},
   year={2021}
}