The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction"

Overview

Overview

This is the code of our paper NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task —— Next Sentence Prediction. We use a sentence-level pre-training task NSP (Next Sentence Prediction) to realize prompt-learning and perform various downstream tasks, such as single sentence classification, sentence pair classification, coreference resolution, cloze-style task, entity linking, entity typing.

On the FewCLUE benchmark, our NSP-BERT outperforms other zero-shot methods (GPT-1-zero and PET-zero) on most of these tasks and comes close to the few-shot methods. We hope NSP-BERT can be an unsupervised tool that can assist other language tasks or models.

Guide

Section Description
Environment The required deployment environment
Downloads Download links for the models' checkpoints used by NSP-BERT
Use examples Learn to use NSP-BERT for different downstream tasks
Baselines Baseline results for several Chinese NLP datasets (partial)
Model Comparison Compare the models published in this repository
Strategy Details Some of the strategies used in the paper
Discussion Discussion and Discrimination for future work

Environment

The environments are as follows:

Python 3.6
bert4keras 0.10.6
tensorflow-gpu 1.15.0

Downloads

Models

We should dowmload the checkpoints of different models. The vocab.txt and the config.json are already in our repository.

Organization Model Name Model Parameters Download Linking Tips
Google BERT-Chinese L=12 H=769 A=12 102M Tensorflow
HFL BERT-wwm L=12 H=769 A=12 102M Tensorflow
BERT-wwm-ext L=12 H=769 A=12 102M Tensorflow
UER BERT-mixed-tiny L=3 H=384 A=6 14M Pytorch *
BERT-mixed-Small L=6 H=512 A=8 31M Pytorch *
BERT-mixed-Base L=12 H=769 A=12 102M Pytorch *
BERT-mixed-Large L=24 H=1024 A=16 327M Pytorch *

* We need to use UER's convert tool to convert UER pytorch to Original Tensorflow.

Datasets

We use FewCLUE datasets and DuEL2.0 (CCKS2020) in our experiments.

Datasets Download Links
FewCLUE https://github.com/CLUEbenchmark/FewCLUE/tree/main/datasets
DuEL2.0 (CCKS2020) https://aistudio.baidu.com/aistudio/competition/detail/83

Put the datasets into the NSP-BERT/datasets/.

Use examples

We can run individual python files in the project directly to evaluate our NSP-BERT.

NSP-BERT
    |- datasets
        |- clue_datasets
           |- ...
        |- DuEL 2.0
           |- dev.json
           |- kb.json
    |- models
        |- uer_mixed_corpus_bert_base
           |- bert_config.json
           |- vocab.txt
           |- bert_model.ckpt...
           |- ...
    |- nsp_bert_classification.py             # Single Sentence Classification
    |- nsp_bert_sentence_pair.py              # Sentence Pair Classification
    |- nsp_bert_cloze_style.py                # Cloze-style Task
    |- nsp_bert_coreference_resolution.py     # Coreference Resolution
    |- nsp_bert_entity_linking.py             # Entity Linking and Entity Typing
    |- utils.py
Python File Task Datasets
nsp_bert_classification.py Single Sentence Classification EPRSTMT, TNEWS, CSLDCP, IFLYTEK
nsp_bert_sentence_pair.py Sentence Pair Classification OCNLI, BUSTM, CSL
nsp_bert_cloze_style.py Cloze-style Task ChID
nsp_bert_coreference_resolution.py Coreference Resolution CLUEWSC
nsp_bert_entity_linking.py Entity Linking and Entity Typing DuEL2.0

Baselines

Reference FewCLUE, we choos 3 training scenarios, fine-tuning, few-shot and zero-shot. The baselines use Chineses-RoBERTa-Base and Chinses-GPT-1 as the backbone model.

Methods

Scenarios Methods
Fine-tuning BERT, RoBERTa
Few-Shot PET, ADAPET, P-tuning, LM-BFF, EFL
Zero-Shot GPT-zero, PET-zero

Downloads

Organization Model Name Model Parameters Download Linking
huawei-noah Chinese GPT L=12 H=769 A=12 102M Tensorflow
HFL RoBERTa-wwm-ext L=12 H=769 A=12 102M Tensorflow

Model Comparison


Main Results

Strategy Details


Strategies

Discussion

  • Sincce NSP-BERT is a sentence-level prompt-learning model, it is significantly superior to GPT-zero and PET-zero in terms of Single Sentence Classification tasks (TNEWS, CSLDCP and IFLYTEK). At the same time, it can solve the Entity Linking task (DuEL2.0), and the model is not limited by the non-fixed-length entity description, which GPT-zero and PET-zero cannot do this.
  • However, it doesn't work as well on Token-Level tasks, such as Cloze-style task and Entity Typing.
  • In future work, it is essential to extend NSP-BERT to the few-shot scenario.
Owner
Sun Yi
PhD student in computer science
Sun Yi
SegNet-Basic with Keras

SegNet-Basic: What is Segnet? Deep Convolutional Encoder-Decoder Architecture for Semantic Pixel-wise Image Segmentation Segnet = (Encoder + Decoder)

Yad Konrad 81 Jun 30, 2022
ShuttleNet: Position-aware Fusion of Rally Progress and Player Styles for Stroke Forecasting in Badminton (AAAI'22)

ShuttleNet: Position-aware Rally Progress and Player Styles Fusion for Stroke Forecasting in Badminton (AAAI 2022) Official code of the paper ShuttleN

Wei-Yao Wang 11 Nov 30, 2022
QilingLab challenge writeup

qiling lab writeup shielder 在 2021/7/21 發布了 QilingLab 來幫助學習 qiling framwork 的用法,剛好最近有用到,順手解了一下並寫了一下 writeup。 前情提要 Qiling 是一款功能強大的模擬框架,和 qemu user mode

Yuan 17 Nov 17, 2022
Computing Shapley values using VAEAC

Shapley values and the VAEAC method In this GitHub repository, we present the implementation of the VAEAC approach from our paper "Using Shapley Value

3 Nov 23, 2022
This repo is customed for VisDrone.

Object Detection for VisDrone(无人机航拍图像目标检测) My environment 1、Windows10 (Linux available) 2、tensorflow = 1.12.0 3、python3.6 (anaconda) 4、cv2 5、ensemble

53 Jul 17, 2022
Self-supervised Deep LiDAR Odometry for Robotic Applications

DeLORA: Self-supervised Deep LiDAR Odometry for Robotic Applications Overview Paper: link Video: link ICRA Presentation: link This is the correspondin

Robotic Systems Lab - Legged Robotics at ETH Zürich 181 Dec 29, 2022
A Marvelous ChatBot implement using PyTorch.

PyTorch Marvelous ChatBot [Update] it's 2019 now, previously model can not catch up state-of-art now. So we just move towards the future a transformer

JinTian 223 Oct 18, 2022
This is a file about Unet implemented in Pytorch

Unet this is an implemetion of Unet in Pytorch and it's architecture is as follows which is the same with paper of Unet component of Unet Convolution

Dragon 1 Dec 03, 2021
Awesome Graph Classification - A collection of important graph embedding, classification and representation learning papers with implementations.

A collection of graph classification methods, covering embedding, deep learning, graph kernel and factorization papers

Benedek Rozemberczki 4.5k Jan 01, 2023
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers Authors: Jaemin Cho, Abhay Zala, and Mohit Bansal (

Jaemin Cho 98 Dec 15, 2022
OOD Dataset Curator and Benchmark for AI-aided Drug Discovery

🔥 DrugOOD 🔥 : OOD Dataset Curator and Benchmark for AI Aided Drug Discovery This is the official implementation of the DrugOOD project, this is the

108 Dec 17, 2022
TSDF++: A Multi-Object Formulation for Dynamic Object Tracking and Reconstruction

TSDF++: A Multi-Object Formulation for Dynamic Object Tracking and Reconstruction TSDF++ is a novel multi-object TSDF formulation that can encode mult

ETHZ ASL 130 Dec 29, 2022
This framework implements the data poisoning method found in the paper Adversarial Examples Make Strong Poisons

Adversarial poison generation and evaluation. This framework implements the data poisoning method found in the paper Adversarial Examples Make Strong

31 Nov 01, 2022
[ICCV'2021] Image Inpainting via Conditional Texture and Structure Dual Generation

[ICCV'2021] Image Inpainting via Conditional Texture and Structure Dual Generation

Xiefan Guo 122 Dec 11, 2022
🥇 LG-AI-Challenge 2022 1위 솔루션 입니다.

LG-AI-Challenge-for-Plant-Classification Dacon에서 진행된 농업 환경 변화에 따른 작물 병해 진단 AI 경진대회 에 대한 코드입니다. (colab directory에 코드가 잘 정리 되어있습니다.) Requirements python

siwooyong 10 Jun 30, 2022
Neural Logic Inductive Learning

Neural Logic Inductive Learning This is the implementation of the Neural Logic Inductive Learning model (NLIL) proposed in the ICLR 2020 paper: Learn

36 Nov 28, 2022
Spatial Single-Cell Analysis Toolkit

Single-Cell Image Analysis Package Scimap is a scalable toolkit for analyzing spatial molecular data. The underlying framework is generalizable to spa

Laboratory of Systems Pharmacology @ Harvard 30 Nov 08, 2022
A deep learning network built with TensorFlow and Keras to classify gender and estimate age.

Convolutional Neural Network (CNN). This repository contains a source code of a deep learning network built with TensorFlow and Keras to classify gend

Pawel Dziemiach 1 Dec 19, 2021
AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.

AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.

Adelaide Intelligent Machines (AIM) Group 3k Jan 02, 2023
Library for implementing reservoir computing models (echo state networks) for multivariate time series classification and clustering.

Framework overview This library allows to quickly implement different architectures based on Reservoir Computing (the family of approaches popularized

Filippo Bianchi 249 Dec 21, 2022