p-tuning for few-shot NLU task

p-tuning_NLU

Overview

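This small project was inspired by Su Jianlin's article on P-tuning and likewise implements an NLU classification task with P-tuning. The idea is the same, but the prompt is implemented differently: here the embeddings of the [unused*] tokens are extracted and used to initialize prompt_embed, which is then followed by an LSTM and an MLP that tie the prompt tokens together, as in the original P-tuning paper "GPT Understands, Too". The results show that in the few-shot setting P-tuning comes very close to full fine-tuning.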
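The prompt encoder described above can be sketched roughly as follows. This is a minimal illustration, not the code in few_shot_ptuning.py: it assumes PyTorch, and the class name `PromptEncoder`, the layer sizes, the two-layer bidirectional LSTM, and the toy embedding matrix are all hypothetical choices made for the example.

```python
import torch
import torch.nn as nn


class PromptEncoder(nn.Module):
    """Hypothetical sketch: prompt_embed -> BiLSTM -> MLP."""

    def __init__(self, word_embeddings, unused_ids):
        super().__init__()
        hidden = word_embeddings.size(1)  # must be even for the BiLSTM below
        # Copy the embedding rows of the [unused*] tokens to initialise the prompt.
        init = word_embeddings[unused_ids].detach().clone()
        self.prompt_embed = nn.Parameter(init)
        # hidden // 2 units per direction keeps the output width equal to `hidden`.
        self.lstm = nn.LSTM(hidden, hidden // 2, num_layers=2,
                            bidirectional=True, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
        )

    def forward(self):
        x = self.prompt_embed.unsqueeze(0)   # (1, n_prompt, hidden)
        out, _ = self.lstm(x)                # prompt tokens now depend on each other
        return self.mlp(out).squeeze(0)      # (n_prompt, hidden)


torch.manual_seed(0)
vocab = torch.randn(100, 16)                 # stand-in for BERT's word embedding matrix
unused_ids = [1, 2, 3, 4]                    # assumed vocabulary ids of [unused1..4]
encoder = PromptEncoder(vocab, unused_ids)
prompt_vectors = encoder()
print(tuple(prompt_vectors.shape))           # (4, 16)
```

The returned vectors would replace the input embeddings at the prompt positions before the sequence enters the pretrained model; how that substitution is done depends on the model wrapper and is not shown here.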

Dataset

The data is a sentiment classification dataset. Download: Baidu Netdisk, extraction code: osja

Evaluation

1. finetune

python few_shot_finetune.py

Test set results:

epoch: 0 - acc: 0.897679 - best_test_acc: 0.8976788252013264
epoch: 1 - acc: 0.876362 - best_test_acc: 0.8976788252013264
epoch: 2 - acc: 0.884889 - best_test_acc: 0.8976788252013264
epoch: 3 - acc: 0.884415 - best_test_acc: 0.8976788252013264
epoch: 4 - acc: 0.884415 - best_test_acc: 0.8976788252013264

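Fine-tuning all parameters on the few-shot data converges after just 1 epoch: the best test accuracy (0.8977) is reached at epoch 0.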

2. p-tuning

python few_shot_ptuning.py

Test set results:

epoch: 0 - acc: 0.546660 - best_test_acc: 0.5466603505447655
epoch: 1 - acc: 0.687826 - best_test_acc: 0.6878256750355282
epoch: 2 - acc: 0.737091 - best_test_acc: 0.7370914258645191
epoch: 3 - acc: 0.722406 - best_test_acc: 0.7370914258645191
epoch: 4 - acc: 0.776883 - best_test_acc: 0.7768829938417812
epoch: 5 - acc: 0.805306 - best_test_acc: 0.8053055423969683
epoch: 6 - acc: 0.833254 - best_test_acc: 0.8332543818095689
epoch: 7 - acc: 0.837991 - best_test_acc: 0.8379914732354334
epoch: 8 - acc: 0.854571 - best_test_acc: 0.8545712932259593
epoch: 9 - acc: 0.858361 - best_test_acc: 0.8583609663666508
epoch: 10 - acc: 0.856466 - best_test_acc: 0.8583609663666508
epoch: 11 - acc: 0.853150 - best_test_acc: 0.8583609663666508
epoch: 12 - acc: 0.868783 - best_test_acc: 0.8687825675035529
epoch: 13 - acc: 0.877309 - best_test_acc: 0.877309332070109
epoch: 14 - acc: 0.873993 - best_test_acc: 0.877309332070109
epoch: 15 - acc: 0.877783 - best_test_acc: 0.8777830412126955
epoch: 16 - acc: 0.882994 - best_test_acc: 0.8829938417811464
epoch: 17 - acc: 0.881573 - best_test_acc: 0.8829938417811464
epoch: 18 - acc: 0.889626 - best_test_acc: 0.8896257697773567
epoch: 19 - acc: 0.877783 - best_test_acc: 0.8896257697773567

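Only prompt_embed, the LSTM, and the MLP are trained for P-tuning. After 20 epochs training is close to convergence, with acc=0.8896, slightly below the fine-tuning acc of 0.8977.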
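Training only prompt_embed, the LSTM, and the MLP amounts to freezing every other parameter and handing the optimizer just the remaining ones. A minimal sketch of that setup, assuming PyTorch; the helper `freeze_all_but`, the `ToyModel` class, and its attribute names are hypothetical stand-ins, not names taken from few_shot_ptuning.py.

```python
import torch
import torch.nn as nn


def freeze_all_but(model, trainable_prefixes):
    """Enable gradients only for parameters whose name starts with a given prefix."""
    prefixes = tuple(trainable_prefixes)
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(prefixes)
    return [p for p in model.parameters() if p.requires_grad]


class ToyModel(nn.Module):
    """Tiny stand-in: `bert` plays the role of the frozen pretrained encoder."""

    def __init__(self):
        super().__init__()
        self.bert = nn.Linear(8, 8)
        self.prompt_embed = nn.Parameter(torch.zeros(4, 8))
        self.lstm = nn.LSTM(8, 4, bidirectional=True, batch_first=True)
        self.mlp = nn.Linear(8, 8)


model = ToyModel()
params = freeze_all_but(model, ["prompt_embed", "lstm", "mlp"])
optimizer = torch.optim.Adam(params, lr=1e-3)  # learning rate is an arbitrary example value

trainable = sum(p.numel() for p in params)
total = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {trainable} / {total}")
```

With a real pretrained encoder the trainable share is a small fraction of the total, which is consistent with P-tuning needing more epochs than full fine-tuning to approach the same accuracy.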

For comparison, Su Jianlin's results:

[Figure: Su Jianlin's comparison results]
