Sequence-Labeling-Early-Exit

Code for the ACL 2021 paper: Accelerating BERT Inference for Sequence Labeling via Early-Exit

Requirements:

Please refer to requirements.txt for the required packages.

How to run?

For OntoNotes (CN):

Set your dataset path in paths.py, then run the steps below.

For the first-stage training:

python -u main.py --device 0  --seed 100 --fast_ptm_name bert --lr 5e-5  --use_crf 0 --dataset ontonotes_cn --fix_ptm_epoch 2 --warmup_step 3000 --use_fastnlp_bert 0 --sampler bucket  --after_bert linear --use_char 0 --use_bigram 0 --gradient_clip_norm_other 5 --gradient_clip_norm_bert 1 --train_mode joint --test_mode joint --if_save 1 --warmup_schedule inverse_square --epoch 20 --joint_weighted 1 --ptm_lr_rate 0.1 --cls_common_lr_scale 0

Then find the exp_path in the corresponding fitlog entry and use it to further train the model with self-sampling.

For the self-sampling training:

python -u further_train.py --seed 100 --msg fuxian --if_save 1 --warmup_schedule inverse_square --epoch 30 --keep_norm_same 1 --sandwich_small 2 --sandwich_full 4 --max_t_level_t -0.5 --train_mode joint_sample_copy --further 0 --flooding 1 --flooding_bias 0 --lr 1e-4 --ptm_lr_rate 0.1 --fix_ptm_epoch 2 --min_win_size 5 --copy_wordpiece all --ckpt_epoch 7 --exp_path 05_11_22_20_52.210103 --device 2 --max_threshold 0.25 --max_threshold_2 0.5

Then find the exp_path and the best epoch in the corresponding fitlog entry, and use them for early-exit inference:

Speed 2X:
python test.py --device 2 --further 1 --record_flops 1 --win_size 15 --threshold 0.1 --ckpt_epoch [ckpt_path] --exp_path [exp_path]
Speed 3X:
python test.py --device 2 --further 1 --record_flops 1 --win_size 5 --threshold 0.15 --ckpt_epoch [ckpt_path] --exp_path [exp_path]
Speed 4X:
python test.py --device 2 --further 1 --record_flops 1 --win_size 5 --threshold 0.25 --ckpt_epoch [ckpt_path] --exp_path [exp_path]
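The three inference commands differ only in --win_size and --threshold, so they can be generated from one small table. The following is a minimal sketch, not part of the repository: the helper name build_test_command and the SPEED_SETTINGS table are hypothetical, and the flag values are copied from the commands above. It only builds the argument lists; pass the result to subprocess.run from the repository root to actually launch test.py.

```python
# Hypothetical helper (not part of the repository): builds the test.py
# command line for each speed-up setting listed in this README.

# (win_size, threshold) pairs copied from the README commands.
SPEED_SETTINGS = {
    "2X": (15, 0.1),
    "3X": (5, 0.15),
    "4X": (5, 0.25),
}


def build_test_command(speed, exp_path, ckpt_epoch, device=2):
    """Return the argument list for one early-exit inference run."""
    win_size, threshold = SPEED_SETTINGS[speed]
    return [
        "python", "test.py",
        "--device", str(device),
        "--further", "1",
        "--record_flops", "1",
        "--win_size", str(win_size),
        "--threshold", str(threshold),
        "--ckpt_epoch", str(ckpt_epoch),
        "--exp_path", str(exp_path),
    ]


if __name__ == "__main__":
    # The placeholders are left as in the README; substitute the values
    # found in your own fitlog entry before running anything.
    for speed in SPEED_SETTINGS:
        print(speed, " ".join(build_test_command(speed, "[exp_path]", "[ckpt_path]")))
```

Keeping the settings in one table makes it harder for the three runs to drift apart in any flag other than the two that control the speed-up.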

Scripts for other datasets are coming soon.

If you have any questions, do not hesitate to open an issue (English and Chinese are both fine).

Owner

李孝男
