The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task--Next Sentence Prediction"

Overview

This is the code for our paper NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task--Next Sentence Prediction. We use the sentence-level pre-training task NSP (Next Sentence Prediction) to realize prompt-learning and perform various downstream tasks, such as single sentence classification, sentence pair classification, coreference resolution, cloze-style tasks, entity linking, and entity typing.

On the FewCLUE benchmark, our NSP-BERT outperforms other zero-shot methods (GPT-1-zero and PET-zero) on most of these tasks and comes close to the few-shot methods. We hope NSP-BERT can be an unsupervised tool that can assist other language tasks or models.
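
The idea of using NSP as a prompt-based zero-shot classifier can be sketched roughly as follows: each candidate label is turned into a prompt sentence, the pair (input text, prompt) is scored, and the label with the highest score wins. This is only an illustration, not the repository's implementation. The names `zero_shot_classify` and `toy_nsp_score`, the prompt texts, and the order of text and prompt are all assumptions; in practice the scorer would be the "is next sentence" probability from a BERT NSP head.

```python
from typing import Callable, Dict, Tuple


def zero_shot_classify(
    text: str,
    label_prompts: Dict[str, str],
    nsp_score: Callable[[str, str], float],
) -> Tuple[str, Dict[str, float]]:
    """Score every (text, prompt) pair and return the best label and all scores."""
    scores = {label: nsp_score(text, prompt) for label, prompt in label_prompts.items()}
    best_label = max(scores, key=scores.get)
    return best_label, scores


def toy_nsp_score(first: str, second: str) -> float:
    """Stand-in scorer (NOT a real NSP head): fraction of words in `second` found in `first`."""
    first_words = set(first.lower().split())
    second_words = second.lower().split()
    return sum(word in first_words for word in second_words) / max(len(second_words), 1)


# Hypothetical prompts, one per label.
prompts = {
    "sports": "this news is about a football match",
    "finance": "this news is about the stock market",
}
label, scores = zero_shot_classify(
    "the football match ended with a late goal", prompts, toy_nsp_score
)
print(label, scores)
```

Passing the scorer in as a function keeps the sketch runnable without a checkpoint; swapping in a real NSP scorer would not change the selection logic.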

Guide

Section | Description
Environment | The required deployment environment
Downloads | Download links for the model checkpoints used by NSP-BERT
Use examples | Learn to use NSP-BERT for different downstream tasks
Baselines | Baseline results for several Chinese NLP datasets (partial)
Model Comparison | Compare the models published in this repository
Strategy Details | Some of the strategies used in the paper
Discussion | Discussion and future work

Environment

The required environment is as follows:

Python 3.6
bert4keras 0.10.6
tensorflow-gpu 1.15.0
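
A quick way to compare an installation against these pins might look like the sketch below. The helper names `installed_version` and `find_mismatches` are hypothetical, and the sketch assumes each package exposes a `__version__` attribute. Note that the `tensorflow-gpu` package is imported under the module name `tensorflow`.

```python
import importlib
import sys
from typing import Callable, Dict, Optional, Tuple

# Versions listed above, keyed by importable module name.
PINNED = {"bert4keras": "0.10.6", "tensorflow": "1.15.0"}


def installed_version(module_name: str) -> Optional[str]:
    """Return the module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        return None
    return getattr(module, "__version__", None)


def find_mismatches(
    pinned: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each mismatching module to (wanted version, found version)."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# True only when running under Python 3.6.
python_ok = sys.version_info[:2] == (3, 6)

# Demo with a fake lookup table so the example runs without the packages installed.
fake_installed = {"bert4keras": "0.10.6", "tensorflow": "2.4.0"}
print(find_mismatches(PINNED, lookup=fake_installed.get))
```

Calling `find_mismatches(PINNED)` without a `lookup` argument checks the real environment instead of the fake table.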

Downloads

Models

We need to download the checkpoints of the different models. The vocab.txt and config.json files are already in our repository.

Organization | Model Name | Model Parameters | Download Link | Tips
Google | BERT-Chinese | L=12 H=768 A=12 102M | Tensorflow |
HFL | BERT-wwm | L=12 H=768 A=12 102M | Tensorflow |
HFL | BERT-wwm-ext | L=12 H=768 A=12 102M | Tensorflow |
UER | BERT-mixed-tiny | L=3 H=384 A=6 14M | Pytorch | *
UER | BERT-mixed-Small | L=6 H=512 A=8 31M | Pytorch | *
UER | BERT-mixed-Base | L=12 H=768 A=12 102M | Pytorch | *
UER | BERT-mixed-Large | L=24 H=1024 A=16 327M | Pytorch | *

* The UER checkpoints are released in PyTorch format, so we need to use UER's conversion tool to convert them to the original TensorFlow format.
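
As a sanity check on the parameter counts, the size of a BERT encoder can be estimated from the number of layers L and the hidden size H. The sketch below counts the embeddings, the Transformer layers, and the pooler. It assumes a vocabulary of 21128 tokens (the Google Chinese BERT vocabulary), 512 positions, 2 segment types, and a feed-forward size of 4H. The function name `bert_param_count` is hypothetical, and exact totals depend on which heads are counted, so other rows of the table may differ slightly. The number of attention heads A does not change the count.

```python
from typing import Optional


def bert_param_count(
    num_layers: int,
    hidden: int,
    vocab_size: int = 21128,     # assumption: Google Chinese BERT vocabulary
    max_positions: int = 512,    # assumption
    type_vocab: int = 2,         # assumption
    intermediate: Optional[int] = None,
) -> int:
    """Approximate parameter count: embeddings + encoder layers + pooler."""
    if intermediate is None:
        intermediate = 4 * hidden
    layer_norm = 2 * hidden  # scale and bias
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value and output projections, each with a bias.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    feed_forward = (
        (hidden * intermediate + intermediate)
        + (intermediate * hidden + hidden)
        + layer_norm
    )
    pooler = hidden * hidden + hidden
    return embeddings + num_layers * (attention + feed_forward) + pooler


base = bert_param_count(num_layers=12, hidden=768)
print(f"{base:,}")  # 102,267,648, i.e. about 102M
```

For a standard BERT-base configuration (L=12, hidden size 768) this gives roughly 102M parameters, in line with the base-sized models listed above.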

Datasets

We use FewCLUE datasets and DuEL2.0 (CCKS2020) in our experiments.

Datasets | Download Links
FewCLUE | https://github.com/CLUEbenchmark/FewCLUE/tree/main/datasets
DuEL2.0 (CCKS2020) | https://aistudio.baidu.com/aistudio/competition/detail/83

Put the datasets into NSP-BERT/datasets/.
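
Once the files are in place, a dataset split could be read along these lines. This is a minimal sketch that assumes the files store one JSON object per line; the helper name `read_jsonl` and the field names `sentence` and `label` in the demo are assumptions, so check them against the actual files.

```python
import json
import os
import tempfile
from typing import Dict, Iterator


def read_jsonl(path: str) -> Iterator[Dict]:
    """Yield one dict per non-empty line of a JSON-lines file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


# Demo on a temporary file with made-up records (field names are assumptions).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.json")
    with open(demo_path, "w", encoding="utf-8") as handle:
        handle.write('{"id": 0, "sentence": "great phone", "label": "Positive"}\n')
        handle.write("\n")  # blank lines are skipped
        handle.write('{"id": 1, "sentence": "broke in a week", "label": "Negative"}\n')
    examples = list(read_jsonl(demo_path))

print(len(examples), examples[0]["label"])
```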

Use examples

We can run the individual Python files in the project directly to evaluate our NSP-BERT. The project is laid out as follows:

NSP-BERT
    |- datasets
        |- clue_datasets
           |- ...
        |- DuEL 2.0
           |- dev.json
           |- kb.json
    |- models
        |- uer_mixed_corpus_bert_base
           |- bert_config.json
           |- vocab.txt
           |- bert_model.ckpt...
           |- ...
    |- nsp_bert_classification.py             # Single Sentence Classification
    |- nsp_bert_sentence_pair.py              # Sentence Pair Classification
    |- nsp_bert_cloze_style.py                # Cloze-style Task
    |- nsp_bert_coreference_resolution.py     # Coreference Resolution
    |- nsp_bert_entity_linking.py             # Entity Linking and Entity Typing
    |- utils.py

Python File | Task | Datasets
nsp_bert_classification.py | Single Sentence Classification | EPRSTMT, TNEWS, CSLDCP, IFLYTEK
nsp_bert_sentence_pair.py | Sentence Pair Classification | OCNLI, BUSTM, CSL
nsp_bert_cloze_style.py | Cloze-style Task | ChID
nsp_bert_coreference_resolution.py | Coreference Resolution | CLUEWSC
nsp_bert_entity_linking.py | Entity Linking and Entity Typing | DuEL2.0
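
Before running a script, it can help to confirm that a model directory matches the layout shown above. The following sketch is not part of the repository. The helper name `resolve_model_files` and the returned key names are hypothetical, and `bert_model.ckpt` is treated as a checkpoint prefix, so any file whose name starts with it counts as the checkpoint.

```python
import os
import tempfile
from typing import Dict

CHECKPOINT_PREFIX = "bert_model.ckpt"


def resolve_model_files(model_dir: str) -> Dict[str, str]:
    """Return config, vocab and checkpoint paths, or raise if something is missing."""
    if not os.path.isdir(model_dir):
        raise FileNotFoundError(f"model directory not found: {model_dir}")
    names = os.listdir(model_dir)
    missing = [name for name in ("bert_config.json", "vocab.txt") if name not in names]
    if not any(name.startswith(CHECKPOINT_PREFIX) for name in names):
        missing.append(CHECKPOINT_PREFIX + "*")
    if missing:
        raise FileNotFoundError(f"missing in {model_dir}: {', '.join(missing)}")
    return {
        "config_path": os.path.join(model_dir, "bert_config.json"),
        "dict_path": os.path.join(model_dir, "vocab.txt"),
        # TensorFlow checkpoints are referenced by prefix, not by a single file.
        "checkpoint_path": os.path.join(model_dir, CHECKPOINT_PREFIX),
    }


# Demo with empty placeholder files; the ".index" file name is only illustrative.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("bert_config.json", "vocab.txt", "bert_model.ckpt.index"):
        open(os.path.join(tmp, name), "w").close()
    paths = resolve_model_files(tmp)
    demo_ok = os.path.basename(paths["checkpoint_path"]) == CHECKPOINT_PREFIX

print(sorted(paths))
```

The resulting paths could then be handed to bert4keras, for example to `build_transformer_model` with `with_nsp=True` and to `Tokenizer`; the exact arguments should be checked against the bert4keras version in use.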

Baselines

Following FewCLUE, we choose three training scenarios: fine-tuning, few-shot and zero-shot. The baselines use Chinese-RoBERTa-Base and Chinese-GPT-1 as the backbone models.

Methods

Scenarios | Methods
Fine-tuning | BERT, RoBERTa
Few-Shot | PET, ADAPET, P-tuning, LM-BFF, EFL
Zero-Shot | GPT-zero, PET-zero

Downloads

Organization | Model Name | Model Parameters | Download Link
huawei-noah | Chinese GPT | L=12 H=768 A=12 102M | Tensorflow
HFL | RoBERTa-wwm-ext | L=12 H=768 A=12 102M | Tensorflow

Model Comparison


[Figure: Main Results]

Strategy Details


[Figure: Strategies]

Discussion

  • Since NSP-BERT is a sentence-level prompt-learning model, it is significantly superior to GPT-zero and PET-zero on Single Sentence Classification tasks (TNEWS, CSLDCP and IFLYTEK). It can also solve the Entity Linking task (DuEL2.0), because the model is not limited by the variable length of entity descriptions, which GPT-zero and PET-zero cannot handle.
  • However, it doesn't work as well on Token-Level tasks, such as the Cloze-style task and Entity Typing.
  • In future work, it is essential to extend NSP-BERT to the few-shot scenario.
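
To illustrate why a sentence-level model is not tied to a fixed description length, the sketch below packs two token lists into the standard BERT pair format ([CLS] A [SEP] B [SEP]) with segment ids. A description of any length fits in the second segment and is only cut off at the maximum sequence length. The function name `pack_pair`, the example tokens, and the choice to truncate the second segment first are assumptions for illustration, not the repository's code.

```python
from typing import List, Sequence, Tuple


def pack_pair(
    tokens_a: Sequence[str],
    tokens_b: Sequence[str],
    max_len: int = 128,
) -> Tuple[List[str], List[int]]:
    """Build [CLS] A [SEP] B [SEP] and its segment ids, truncating B before A."""
    budget = max_len - 3  # room left after [CLS] and two [SEP] tokens
    if budget <= 0:
        raise ValueError("max_len must be greater than 3")
    first = list(tokens_a)[:budget]
    second = list(tokens_b)[: budget - len(first)]
    tokens = ["[CLS]"] + first + ["[SEP]"] + second + ["[SEP]"]
    segment_ids = [0] * (len(first) + 2) + [1] * (len(second) + 1)
    return tokens, segment_ids


# A short mention context paired with a longer (made-up) entity description.
tokens, segment_ids = pack_pair(
    ["he", "plays"], ["a", "b", "c", "d"], max_len=8
)
print(tokens)
print(segment_ids)
```

Here the four-token description is trimmed to three tokens to fit the limit of eight, while the first segment is kept whole.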
Owner
Sun Yi
PhD student in computer science