K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce (Findings of EMNLP 2021)

Last update: Nov 16, 2022

Introduction

K-PLUG is a knowledge-injected pre-trained language model for natural language understanding and generation in e-commerce.

Installation

PyTorch version >= 1.5.0
Python version >= 3.6
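
One way to check an environment against these minimums is to compare version strings numerically rather than lexically, so that 1.10.0 counts as newer than 1.5.0. The helper below is a minimal sketch, not part of the repository: `version_ge` is a hypothetical name, and it assumes a `sort` that supports `-V` (as in GNU coreutils).

```shell
#!/usr/bin/env bash
# version_ge A B: succeed when version A >= version B.
# Assumes `sort -V` (version sort) is available.
version_ge() {
    # The smaller of the two versions sorts first; A >= B iff that one is B.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example: 1.10.0 satisfies the ">= 1.5.0" requirement, 1.4.0 does not.
version_ge 1.10.0 1.5.0 && echo "1.10.0 ok"
version_ge 1.4.0 1.5.0 || echo "1.4.0 too old"
```

In practice the first argument would come from the interpreter itself, for example the output of `python -c 'import torch; print(torch.__version__)'` with any local suffix such as `+cu117` stripped.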

git clone https://github.com/pytorch/fairseq.git
cd fairseq 
pip install --editable ./

Pre-training

Prepare the data for pre-training, then run train.sh:

export CUDA_VISIBLE_DEVICES=0,1,2,3

function join_by { local IFS="$1"; shift; echo "$*"; }
DATA_DIR=$(join_by : data/kplug/bin/part*)

USER_DIR=src
TOKENS_PER_SAMPLE=512
WARMUP_UPDATES=10000
PEAK_LR=0.0005
TOTAL_UPDATES=125000
#MAX_SENTENCES=8
MAX_SENTENCES=16
UPDATE_FREQ=16   # batch_size=update_freq*max_sentences*nGPU = 16*16*4 = 1024

SUB_TASK=mlm_clm_sentcls_segcls_titlegen 
## ablation task
#SUB_TASK=clm_sentcls_segcls_titlegen
#SUB_TASK=mlm_sentcls_segcls_titlegen
#SUB_TASK=mlm_clm_sentcls_segcls
#SUB_TASK=mlm_clm_segcls_titlegen
#SUB_TASK=mlm_clm_sentcls_titlegen

fairseq-train $DATA_DIR \
    --user-dir $USER_DIR \
    --task multitask_lm \
    --sub-task $SUB_TASK \
    --arch transformer_pretrain_base \
    --min-loss-scale=0.000001 \
    --sample-break-mode none \
    --tokens-per-sample $TOKENS_PER_SAMPLE \
    --criterion multitask_lm \
    --apply-bert-init \
    --max-source-positions 512 --max-target-positions 512 \
    --optimizer adam --adam-betas '(0.9, 0.98)' --adam-eps 1e-6 --clip-norm 0.0 \
    --lr-scheduler polynomial_decay --lr $PEAK_LR \
    --warmup-updates $WARMUP_UPDATES --total-num-update $TOTAL_UPDATES \
    --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 \
    --max-sentences $MAX_SENTENCES --update-freq $UPDATE_FREQ \
    --ddp-backend=no_c10d \
    --tensorboard-logdir tensorboard \
    --classification-head-name pretrain_head --num-classes 40 \
    --tagging-head-name pretrain_tag_head --tag-num-classes 2 \
    --fp16
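
The script above relies on three pieces of shell behaviour and arithmetic that are easy to misread. `join_by` glues the shard directories into one colon-separated path, which fairseq tasks conventionally treat as a list of data shards. The effective batch size is the product noted in the `UPDATE_FREQ` comment. The learning rate rises to `PEAK_LR` over the warmup updates and then decays. The sketch below reproduces each in isolation. `effective_batch` and `lr_at` are hypothetical helper names, the shard names are made up, and `lr_at` assumes the `polynomial_decay` scheduler runs with power 1.0 and an end learning rate of 0, since the script sets neither.

```shell
#!/usr/bin/env bash
# join_by SEP ARGS...: join ARGS with the single-character separator SEP.
# Setting IFS locally makes "$*" expand with SEP between the arguments.
join_by() { local IFS="$1"; shift; echo "$*"; }

# effective_batch MAX_SENTENCES UPDATE_FREQ N_GPU:
# number of sentences contributing to one optimizer update.
effective_batch() { echo $(( $1 * $2 * $3 )); }

# lr_at STEP PEAK_LR WARMUP TOTAL: linear warmup to PEAK_LR, then linear
# decay to 0 at TOTAL (assumption: power 1.0, end learning rate 0).
lr_at() {
    awk -v s="$1" -v peak="$2" -v w="$3" -v t="$4" 'BEGIN {
        if (w > 0 && s <= w)  lr = peak * s / w
        else if (s >= t)      lr = 0
        else                  lr = peak * (1 - (s - w) / (t - w))
        printf "%.6f\n", lr
    }'
}

# Hypothetical shard names, for illustration only.
join_by : data/kplug/bin/part0 data/kplug/bin/part1
effective_batch 16 16 4          # prints 1024
lr_at 5000 0.0005 10000 125000   # halfway through warmup: 0.000250
lr_at 67500 0.0005 10000 125000  # halfway through decay:  0.000250
```

When changing the number of GPUs, adjusting `UPDATE_FREQ` so that the product stays at 1024 keeps the optimizer seeing the same batch size as in the original configuration.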

Fine-tuning and Inference

Fine-tuning on JDDC (Response Generation)

Fine-tuning on ECD Corpus (Response Retrieval)

Fine-tuning on JD Product Dataset (Abstractive Summarization)

Fine-tuning on MEPAVE Dataset (Sequence Tagging)

Owner

Xu Song
