
K-PLUG

Implementation of the paper K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce (Findings of EMNLP 2021), by Song Xu, Haoran Li, Peng Yuan, Yujia Wang, Youzheng Wu, Xiaodong He, Ying Liu, and Bowen Zhou. K-PLUG is a knowledge-injected pre-trained language model based on an encoder-decoder framework that can be transferred to both natural language understanding and generation tasks.

Web Demo

What's New:

Quick Start with Docker (fairseq)

GPU

git clone https://github.com/xu-song/k-plug.git
cd k-plug

nvidia-docker run -it --shm-size 15G --network=host -v $(pwd):/workspace/ bitspeech/fairseq:latest bash
sh finetune_cepsum_demo.sh  # 1. download model 2. finetune 3. inference 4. evaluation

CPU

git clone https://github.com/xu-song/k-plug.git
cd k-plug

docker run -it --network=host -v $(pwd):/workspace/ bitspeech/fairseq:latest bash
sh finetune_cepsum_demo.sh

Quick Start with HuggingFace

We also provide a HuggingFace API.
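
As a rough sketch (not the repository's exact API): assuming the checkpoint has been converted into a standard HuggingFace seq2seq format, it can be loaded with the usual transformers auto classes. The local directory ./kplug-hf below is a placeholder, not an official model identifier.

# Minimal sketch, assuming a K-PLUG checkpoint converted to a HuggingFace-format
# directory; "./kplug-hf" is a placeholder path, not an official model identifier.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("./kplug-hf")
model = AutoModelForSeq2SeqLM.from_pretrained("./kplug-hf")

inputs = tokenizer("An example product description.", return_tensors="pt")
summary_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))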

Model Zoo

We provide pre-trained and fine-tuned K-PLUG models.

Encoder: 6L-768H-12A, Decoder: 6L-768H-12A, 110M parameters.

| Task | Task Type | Task Description | Model |
|------|-----------|------------------|-------|
| pre-train | NLU & NLG | multi-task pre-training | kplug.pt |
| ft-MEPAVE | NLU | sequence tagging | kplug_ft_ave.pt |
| ft-ECD | NLU | retrieval-based chatbot | kplug_ft_ecd.pt |
| ft-JDDC | NLG | generation-based chatbot | kplug_ft_jddc.pt |
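
To sanity-check a downloaded checkpoint (for example, the 110M-parameter figure above), a quick inspection with plain PyTorch is enough; the filename kplug.pt below stands for wherever the model was saved locally.

# Quick sanity check of a downloaded fairseq checkpoint; "kplug.pt" is assumed to
# be the local path of the pre-trained model listed above.
# (On newer PyTorch you may need torch.load(..., weights_only=False).)
import torch

state = torch.load("kplug.pt", map_location="cpu")
print(state.keys())  # fairseq checkpoints typically contain 'model', 'args'/'cfg', etc.

# Counts all tensors in the state dict (parameters and buffers), so the figure
# is approximate; it should be roughly 110M.
num_params = sum(p.numel() for p in state["model"].values())
print(f"parameters: {num_params / 1e6:.1f}M")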

Pre-training

Prepare the data for pre-training, then run train.sh:

export CUDA_VISIBLE_DEVICES=0,1,2,3

function join_by { local IFS="$1"; shift; echo "$*"; }
DATA_DIR=$(join_by : data/kplug/bin/part*)

USER_DIR=src
TOKENS_PER_SAMPLE=512
WARMUP_UPDATES=10000
PEAK_LR=0.0005
TOTAL_UPDATES=125000
#MAX_SENTENCES=8
MAX_SENTENCES=16
UPDATE_FREQ=16   # batch_size=update_freq*max_sentences*nGPU = 16*16*4 = 1024

SUB_TASK=mlm_clm_sentcls_segcls_titlegen 
## ablation task
#SUB_TASK=clm_sentcls_segcls_titlegen
#SUB_TASK=mlm_sentcls_segcls_titlegen
#SUB_TASK=mlm_clm_sentcls_segcls
#SUB_TASK=mlm_clm_segcls_titlegen
#SUB_TASK=mlm_clm_sentcls_titlegen

fairseq-train $DATA_DIR \
    --user-dir $USER_DIR \
    --task multitask_lm \
    --sub-task $SUB_TASK \
    --arch transformer_pretrain_base \
    --min-loss-scale=0.000001 \
    --sample-break-mode none \
    --tokens-per-sample $TOKENS_PER_SAMPLE \
    --criterion multitask_lm \
    --apply-bert-init \
    --max-source-positions 512 --max-target-positions 512 \
    --optimizer adam --adam-betas '(0.9, 0.98)' --adam-eps 1e-6 --clip-norm 0.0 \
    --lr-scheduler polynomial_decay --lr $PEAK_LR \
    --warmup-updates $WARMUP_UPDATES --total-num-update $TOTAL_UPDATES \
    --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 \
    --max-sentences $MAX_SENTENCES --update-freq $UPDATE_FREQ \
    --ddp-backend=no_c10d \
    --tensorboard-logdir tensorboard \
    --classification-head-name pretrain_head --num-classes 40 \
    --tagging-head-name pretrain_tag_head --tag-num-classes 2 \
    --fp16

Fine-tuning and Inference

Finetuning on JDDC (Response Generation)

Finetuning on ECD Corpus (Response Retrieval)

Finetuning on CEPSUM Dataset (Abstractive Summarization)

Finetuning on MEPAVE Dataset (Sequence Tagging)
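
As a hedged sketch of loading a fine-tuned checkpoint through fairseq's Python API: the path below is a placeholder, and depending on the fairseq version and the task stored in each checkpoint, the dictionaries/binarized data from the corresponding fine-tuning step may also need to be present.

# Minimal sketch for loading a fine-tuned K-PLUG checkpoint via fairseq's Python API.
# "checkpoints/kplug_ft_jddc.pt" is a placeholder path for a downloaded model.
import argparse
from fairseq import checkpoint_utils, utils

# Register this repo's custom tasks and architectures (same --user-dir as above).
utils.import_user_module(argparse.Namespace(user_dir="src"))

# Rebuild the task and model from the checkpoint's saved arguments.
models, cfg, task = checkpoint_utils.load_model_ensemble_and_task(
    ["checkpoints/kplug_ft_jddc.pt"]
)
model = models[0].eval()
print(model)  # inspect the restored encoder-decoder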

Dependencies

K-PLUG is currently implemented on top of fairseq. The dependencies are as follows:

  • PyTorch version >= 1.5.0
  • Python version >= 3.6
  • fairseq
git clone https://github.com/pytorch/fairseq.git
cd fairseq 
pip install --editable ./
# python setup.py build_ext --inplace

Reference

If you use this code, please cite our paper:

@inproceedings{xu-etal-2021-k-plug,
    title = "K-{PLUG}: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in {E}-Commerce",
    author = {Xu, Song and Li, Haoran and Yuan, Peng and Wang, Yujia and Wu, Youzheng and He, Xiaodong and Liu, Ying and Zhou, Bowen},
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    year = "2021",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-emnlp.1"
}
