Code and datasets for our paper "PTR: Prompt Tuning with Rules for Text Classification"

Last update: Dec 30, 2022

PTR

If you use the code, please cite the following paper:

@article{han2021ptr,
  title={PTR: Prompt Tuning with Rules for Text Classification},
  author={Han, Xu and Zhao, Weilin and Ding, Ning and Liu, Zhiyuan and Sun, Maosong},
  journal={arXiv preprint arXiv:2105.11259},
  year={2021}
}

Requirements

The model is implemented using PyTorch. The versions of packages used are shown below.

numpy>=1.18.0
scikit-learn>=0.22.1
scipy>=1.4.1
torch>=1.3.0
tqdm>=4.41.1
transformers>=4.0.0
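
As a convenience, the minimum versions listed above can be checked against the current environment before launching any runs. The following is a minimal sketch, not part of the released code: the helper names `version_tuple` and `check_requirements` are hypothetical, and the version parsing is deliberately simple (it keeps only the leading numeric components and ignores suffixes such as `+cu117` or `rc1`).

```python
from importlib import metadata

# Minimum versions copied from the requirements list above.
MINIMUM_VERSIONS = {
    "numpy": "1.18.0",
    "scikit-learn": "0.22.1",
    "scipy": "1.4.1",
    "torch": "1.3.0",
    "tqdm": "4.41.1",
    "transformers": "4.0.0",
}


def version_tuple(version):
    """Turn a string like '1.13.1+cu117' into (1, 13, 1).

    Hypothetical helper: only leading numeric components are kept,
    and the result is padded with zeros to at least three entries.
    """
    parts = []
    for piece in version.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_requirements(minimums=MINIMUM_VERSIONS):
    """Return a list of problems; an empty list means every minimum is met."""
    problems = []
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed (need >= {minimum})")
            continue
        if version_tuple(installed) < version_tuple(minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

Running the script prints one line per unmet requirement and nothing when the environment satisfies all of them.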

Baselines

Some baselines, especially those using entity markers, come from the RE_improved_baseline project.

Datasets

We provide all the datasets and prompts used in our experiments.

Run the experiments

(1) For TACRED

mkdir -p results/tacred/train results/tacred/val results/tacred/test
cd code_script
bash run_large_tacred.sh

(2) For TACREV

mkdir -p results/tacrev/train results/tacrev/val results/tacrev/test
cd code_script
bash run_large_tacrev.sh

(3) For RETACRED

mkdir -p results/retacred/train results/retacred/val results/retacred/test
cd code_script
bash run_large_retacred.sh
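
The three recipes above create the same `results/<dataset>/{train,val,test}` layout for each dataset. A minimal sketch of a helper that prepares all of them in one step is shown below; the function name `make_result_dirs` is hypothetical and is not part of the released code, and it assumes it is called with the repository root as `root`.

```python
from pathlib import Path

# Dataset and split names taken from the shell commands above.
DATASETS = ("tacred", "tacrev", "retacred")
SPLITS = ("train", "val", "test")


def make_result_dirs(root="."):
    """Create results/<dataset>/<split> under root and return the paths.

    Existing directories are left untouched, so the call is safe to repeat.
    """
    created = []
    for dataset in DATASETS:
        for split in SPLITS:
            path = Path(root) / "results" / dataset / split
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created
```

After calling `make_result_dirs()` from the repository root, each `run_large_*.sh` script can be started from `code_script` as described above.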

Owner

THUNLP