The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"

Overview

Overview

This is the code of our paper NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction. We use a sentence-level pre-training task NSP (Next Sentence Prediction) to realize prompt-learning and perform various downstream tasks, such as single sentence classification, sentence pair classification, coreference resolution, cloze-style task, entity linking, entity typing.

On the FewCLUE benchmark, our NSP-BERT outperforms other zero-shot methods (GPT-1-zero and PET-zero) on most of these tasks and comes close to the few-shot methods. We hope NSP-BERT can be an unsupervised tool that can assist other language tasks or models.

Guide

Section Description
Environment The required deployment environment
Downloads Download links for the models' checkpoints used by NSP-BERT
Use examples Learn to use NSP-BERT for different downstream tasks
Baselines Baseline results for several Chinese NLP datasets (partial)
Model Comparison Compare the models published in this repository
Strategy Details Some of the strategies used in the paper
Discussion Discussion and Discrimination for future work

Environment

The environments are as follows:

Python 3.6
bert4keras 0.10.6
tensorflow-gpu 1.15.0

Downloads

Models

We should dowmload the checkpoints of different models. The vocab.txt and the config.json are already in our repository.

Organization Model Name Model Parameters Download Linking Tips
Google BERT-Chinese L=12 H=769 A=12 102M Tensorflow
HFL BERT-wwm L=12 H=769 A=12 102M Tensorflow
BERT-wwm-ext L=12 H=769 A=12 102M Tensorflow
UER BERT-mixed-tiny L=3 H=384 A=6 14M Pytorch *
BERT-mixed-Small L=6 H=512 A=8 31M Pytorch *
BERT-mixed-Base L=12 H=769 A=12 102M Pytorch *
BERT-mixed-Large L=24 H=1024 A=16 327M Pytorch *

* We need to use UER's convert tool to convert UER pytorch to Original Tensorflow.

Datasets

We use FewCLUE datasets and DuEL2.0 (CCKS2020) in our experiments.

Datasets Download Links
FewCLUE https://github.com/CLUEbenchmark/FewCLUE/tree/main/datasets
DuEL2.0 (CCKS2020) https://aistudio.baidu.com/aistudio/competition/detail/83

Put the datasets into the NSP-BERT/datasets/.

Use examples

We can run individual python files in the project directly to evaluate our NSP-BERT.

NSP-BERT
    |- datasets
        |- clue_datasets
           |- ...
        |- DuEL 2.0
           |- dev.json
           |- kb.json
    |- models
        |- uer_mixed_corpus_bert_base
           |- bert_config.json
           |- vocab.txt
           |- bert_model.ckpt...
           |- ...
    |- nsp_bert_classification.py             # Single Sentence Classification
    |- nsp_bert_sentence_pair.py              # Sentence Pair Classification
    |- nsp_bert_cloze_style.py                # Cloze-style Task
    |- nsp_bert_coreference_resolution.py     # Coreference Resolution
    |- nsp_bert_entity_linking.py             # Entity Linking and Entity Typing
    |- utils.py
Python File Task Datasets
nsp_bert_classification.py Single Sentence Classification EPRSTMT, TNEWS, CSLDCP, IFLYTEK
nsp_bert_sentence_pair.py Sentence Pair Classification OCNLI, BUSTM, CSL
nsp_bert_cloze_style.py Cloze-style Task ChID
nsp_bert_coreference_resolution.py Coreference Resolution CLUEWSC
nsp_bert_entity_linking.py Entity Linking and Entity Typing DuEL2.0

Baselines

Reference FewCLUE, we choos 3 training scenarios, fine-tuning, few-shot and zero-shot. The baselines use Chineses-RoBERTa-Base and Chinses-GPT-1 as the backbone model.

Methods

Scenarios Methods
Fine-tuning BERT, RoBERTa
Few-Shot PET, ADAPET, P-tuning, LM-BFF, EFL
Zero-Shot GPT-zero, PET-zero

Downloads

Organization Model Name Model Parameters Download Linking
huawei-noah Chinese GPT L=12 H=769 A=12 102M Tensorflow
HFL RoBERTa-wwm-ext L=12 H=769 A=12 102M Tensorflow

Model Comparison


Main Results

Strategy Details


Strategies

Discussion

  • Sincce NSP-BERT is a sentence-level prompt-learning model, it is significantly superior to GPT-zero and PET-zero in terms of Single Sentence Classification tasks (TNEWS, CSLDCP and IFLYTEK). At the same time, it can solve the Entity Linking task (DuEL2.0), and the model is not limited by the non-fixed-length entity description, which GPT-zero and PET-zero cannot do this.
  • However, it doesn't work as well on Token-Level tasks, such as Cloze-style task and Entity Typing.
  • In future work, it is essential to extend NSP-BERT to the few-shot scenario.
Owner
Sun Yi
PhD student in computer science
Sun Yi
Codebase of deep learning models for inferring stability of mRNA molecules

Kaggle OpenVaccine Models Codebase of deep learning models for inferring stability of mRNA molecules, corresponding to the Kaggle Open Vaccine Challen

Eternagame 40 Dec 29, 2022
Self-supervised learning on Graph Representation Learning (node-level task)

graph_SSL Self-supervised learning on Graph Representation Learning (node-level task) How to run the code To run GRACE, sh run_GRACE.sh To run GCA, sh

Namkyeong Lee 3 Dec 31, 2021
CSE-519---Project - Job Title Analysis (Project for CSE 519 - Data Science Fundamentals)

A Multifaceted Approach to Job Title Analysis CSE 519 - Data Science Fundamentals Project Description Project consists of three parts: Salary Predicti

Jimit Dholakia 1 Jan 04, 2022
Code for "Discovering Non-monotonic Autoregressive Orderings with Variational Inference" (paper and code updated from ICLR 2021)

Discovering Non-monotonic Autoregressive Orderings with Variational Inference Description This package contains the source code implementation of the

Xuanlin (Simon) Li 10 Dec 29, 2022
A library for Deep Learning Implementations and utils

deeply A Deep Learning library Table of Contents Features Quick Start Usage License Features Python 2.7+ and Python 3.4+ compatible. Quick Start $ pip

Achilles Rasquinha 1 Dec 12, 2022
FedTorch is an open-source Python package for distributed and federated training of machine learning models using PyTorch distributed API

FedTorch is a generic repository for benchmarking different federated and distributed learning algorithms using PyTorch Distributed API.

Machine Learning and Optimization Lab @PennState 136 Dec 23, 2022
Advantage Actor Critic (A2C): jax + flax implementation

Advantage Actor Critic (A2C): jax + flax implementation Current version supports only environments with continious action spaces and was tested on muj

Andrey 3 Jan 23, 2022
Epidemiology analysis package

zEpid zEpid is an epidemiology analysis package, providing easy to use tools for epidemiologists coding in Python 3.5+. The purpose of this library is

Paul Zivich 111 Jan 08, 2023
TensorFlow implementation of Deep Reinforcement Learning papers

Deep Reinforcement Learning in TensorFlow TensorFlow implementation of Deep Reinforcement Learning papers. This implementation contains: [1] Playing A

Taehoon Kim 1.6k Jan 03, 2023
Stacked Generative Adversarial Networks

Stacked Generative Adversarial Networks This repository contains code for the paper "Stacked Generative Adversarial Networks", CVPR 2017. Part of the

Xun Huang 241 May 07, 2022
particle tracking model, works with the ROMS output file(qck.nc, his.nc)

particle-tracking-model-for-ROMS particle tracking model, works with the ROMS output file(qck.nc, his.nc) description this is a 2-dimensional particle

xusheng 1 Jan 11, 2022
Half Instance Normalization Network for Image Restoration

HINet Half Instance Normalization Network for Image Restoration, based on https://github.com/megvii-model/HINet. Dependencies NumPy PyTorch, preferabl

Holy Wu 4 Jun 06, 2022
An investigation project for SISR.

SISR-Survey An investigation project for SISR. This repository is an official project of the paper "From Beginner to Master: A Survey for Deep Learnin

Juncheng Li 79 Oct 20, 2022
Reinforcement Learning via Supervised Learning

Reinforcement Learning via Supervised Learning Installation Run pip install -e . in an environment with Python = 3.7.0, 3.9. The code depends on MuJ

Scott Emmons 49 Nov 28, 2022
Tweesent-back - Tweesent backend uses fastAPI as the web framework

TweeSent Backend Tweesent backend. This repo uses fastAPI as the web framework.

0 Mar 26, 2022
Semi Supervised Learning for Medical Image Segmentation, a collection of literature reviews and code implementations.

Semi-supervised-learning-for-medical-image-segmentation. Recently, semi-supervised image segmentation has become a hot topic in medical image computin

Healthcare Intelligence Laboratory 1.3k Jan 03, 2023
[CVPR 2022] CoTTA Code for our CVPR 2022 paper Continual Test-Time Domain Adaptation

CoTTA Code for our CVPR 2022 paper Continual Test-Time Domain Adaptation Prerequisite Please create and activate the following conda envrionment. To r

Qin Wang 87 Jan 08, 2023
[SIGGRAPH 2020] Attribute2Font: Creating Fonts You Want From Attributes

Attr2Font Introduction This is the official PyTorch implementation of the Attribute2Font: Creating Fonts You Want From Attributes. Paper: arXiv | Rese

Yue Gao 200 Dec 15, 2022
Taking A Closer Look at Domain Shift: Category-level Adversaries for Semantics Consistent Domain Adaptation

Taking A Closer Look at Domain Shift: Category-level Adversaries for Semantics Consistent Domain Adaptation (CVPR2019) This is a pytorch implementatio

Yawei Luo 280 Jan 01, 2023
RMTD: Robust Moving Target Defence Against False Data Injection Attacks in Power Grids

RMTD: Robust Moving Target Defence Against False Data Injection Attacks in Power Grids Real-time detection performance. This repo contains the code an

0 Nov 10, 2021