K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce (EMNLP Founding 2021)

Last update: Nov 16, 2022

Related tags

Overview

Introduction

K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce.

Installation

PyTorch version >= 1.5.0
Python version >= 3.6

git clone https://github.com/pytorch/fairseq.git
cd fairseq 
pip install --editable ./

Pre-training

prepare data for pre-training train.sh

export CUDA_VISIBLE_DEVICES=0,1,2,3

function join_by { local IFS="$1"; shift; echo "$*"; }
DATA_DIR=$(join_by : data/kplug/bin/part*)

USER_DIR=src
TOKENS_PER_SAMPLE=512
WARMUP_UPDATES=10000
PEAK_LR=0.0005
TOTAL_UPDATES=125000
#MAX_SENTENCES=8
MAX_SENTENCES=16
UPDATE_FREQ=16   # batch_size=update_freq*max_sentences*nGPU = 16*16*4 = 1024

SUB_TASK=mlm_clm_sentcls_segcls_titlegen 
## ablation task
#SUB_TASK=clm_sentcls_segcls_titlegen
#SUB_TASK=mlm_sentcls_segcls_titlegen
#SUB_TASK=mlm_clm_sentcls_segcls
#SUB_TASK=mlm_clm_segcls_titlegen
#SUB_TASK=mlm_clm_sentcls_titlegen

fairseq-train $DATA_DIR \
    --user-dir $USER_DIR \
    --task multitask_lm \
    --sub-task $SUB_TASK \
    --arch transformer_pretrain_base \
    --min-loss-scale=0.000001 \
    --sample-break-mode none \
    --tokens-per-sample $TOKENS_PER_SAMPLE \
    --criterion multitask_lm \
    --apply-bert-init \
    --max-source-positions 512 --max-target-positions 512 \
    --optimizer adam --adam-betas '(0.9, 0.98)' --adam-eps 1e-6 --clip-norm 0.0 \
    --lr-scheduler polynomial_decay --lr $PEAK_LR \
    --warmup-updates $WARMUP_UPDATES --total-num-update $TOTAL_UPDATES \
    --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 \
    --max-sentences $MAX_SENTENCES --update-freq $UPDATE_FREQ \
    --ddp-backend=no_c10d \
    --tensorboard-logdir tensorboard \
    --classification-head-name pretrain_head --num-classes 40 \
    --tagging-head-name pretrain_tag_head --tag-num-classes 2 \
    --fp16

Fine-tuning and Inference

Finetuning on JDDC (Response Generation)

Finetuning on ECD Corpus (Response Retrieval)

Finetuning on JD Product Dataset (Abstractive Summarization)

Finetuning on MEPAVE Dataset (Sequence Tagging)

K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce (EMNLP Founding 2021)

Related tags

Overview

Introduction

Installation

Pre-training

Fine-tuning and Inference

Owner

Xu Song

【CVPR 2021, Variational Inference Framework, PyTorch】 From Rain Generation to Rain Removal

Guiding evolutionary strategies by (inaccurate) differentiable robot simulators @ NeurIPS, 4th Robot Learning Workshop

The world's simplest facial recognition api for Python and the command line

A Pytree Module system for Deep Learning in JAX

This is the source code for: Context-aware Entity Typing in Knowledge Graphs.

This repo implements several applications of the proposed generalized Bures-Wasserstein (GBW) geometry on symmetric positive definite matrices.

Simple and Robust Loss Design for Multi-Label Learning with Missing Labels

Codes for CyGen, the novel generative modeling framework proposed in "On the Generative Utility of Cyclic Conditionals" (NeurIPS-21)

Official implementation of "Can You Spot the Chameleon? Adversarially Camouflaging Images from Co-Salient Object Detection" in CVPR 2022.

Accuracy Aligned. Concise Implementation of Swin Transformer

Python package for downloading ECMWF reanalysis data and converting it into a time series format.

A simple Python library for stochastic graphical ecological models

Official implementation of the PICASO: Permutation-Invariant Cascaded Attentional Set Operator

A library for researching neural networks compression and acceleration methods.

Towards the D-Optimal Online Experiment Design for Recommender Selection (KDD 2021)

Real-time VIBE: Frame by Frame Inference of VIBE (Video Inference for Human Body Pose and Shape Estimation)

Neural Contours: Learning to Draw Lines from 3D Shapes (CVPR2020)

Meta Learning for Semi-Supervised Few-Shot Classification

A python interface for training Reinforcement Learning bots to battle on pokemon showdown

CSPML (crystal structure prediction with machine learning-based element substitution)