Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators

Last update: Jul 22, 2022

Related tags

Overview

Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators

This is our Pytorch implementation for the paper:

Peiyu Liu, Ze-Feng Gao, Wayne Xin Zhao, Zhi-Yuan Xie, Zhong-Yi Lu and Ji-Rong Wen(2021). Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators

Introduction

This paper presents a novel pre-trained language models (PLM) compression approach based on the matrix product operator (short as MPO) from quantum many-body physics. It can decompose an original matrix into central tensors (containing the core information) and auxiliary tensors (with only a small proportion of parameters). With the decomposed MPO structure, we propose a novel fine-tuning strategy by only updating the parameters from the auxiliary tensors, and design an optimization algorithm for MPO-based approximation over stacked network architectures. Our approach can be applied to the original or the compressed PLMs in a general way, which derives a lighter network and significantly reduces the parameters to be fine-tuned. Extensive experiments have demonstrated the effectiveness of the proposed approach in model compression, especially the reduction in fine-tuning parameters (91% reduction on average).

For more details about the technique of MPOP, refer to our paper

Release Notes

First version: 2021/05/21
add albert code: 2021/06/08

Requirements

python 3.7
torch >= 1.8.0

Installation

pip install mpo_lab

Lightweight fine-tuning

In lightweight fine-tuning, we use original ALBERT without fine-tuning as to be compressed. By performing MPO decomposition on each weight matrix, we obtain four auxiliary tensors and one central tensor per tensor set. This provides a good initialization for the task-specific distillation. Refer to run_all_albert_fine_tune.sh

Important arguments:

--data_dir          Path to load dataset
--mpo_lr            Learning rate of tensors produced by MPO
--mpo_layers        Name of components to be decomposed with MPO
--emb_trunc         Truncation number of the central tensor in word embedding layer
--linear_trunc      Truncation number of the central tensor in linear layer
--attention_trunc   Truncation number of the central tensor in attention layer
--load_layer        Name of components to be loaded from exist checkpoint file
--update_mpo_layer  Name of components to be update when training the model

Dimension squeezing

In Dimension squeezing, we compute approiate truncation order for the whole model. In order to re-produce the results in paper, we prepare the model after lightweight fine-tuning. Refer to run_all_albert_fine_tune.sh

albert models google drive

Acknowledgment

Any scientific publications that use our codes should cite the following paper as the reference:

@inproceedings{Liu-ACL-2021,
  author    = {Peiyu Liu and
               Ze{-}Feng Gao and
               Wayne Xin Zhao and
               Z. Y. Xie and
               Zhong{-}Yi Lu and
               Ji{-}Rong Wen},
  title     = "Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression
               based on Matrix Product Operators",
  booktitle = {{ACL}},
  year      = {2021},
}

TODO

prepare data and code
upload models in order to reproduce experiments
supplementary details for paper

Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators

Related tags

Overview

Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators

Introduction

Release Notes

Requirements

Installation

Lightweight fine-tuning

Dimension squeezing

Acknowledgment

TODO

Owner

RUCAIBox

Discovering and Achieving Goals via World Models

Code for reproducing our paper: LMSOC: An Approach for Socially Sensitive Pretraining

FANet - Real-time Semantic Segmentation with Fast Attention

Text2Art is an AI art generator powered with VQGAN + CLIP and CLIPDrawer models

Autonomous Ground Vehicle Navigation and Control Simulation Examples in Python

Repositorio de los Laboratorios de Análisis Numérico / Análisis Numérico I de FAMAF, UNC.

DLWP: Deep Learning Weather Prediction

BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training

Mapping Conditional Distributions for Domain Adaptation Under Generalized Target Shift

U-Time: A Fully Convolutional Network for Time Series Segmentation

A `Neural = Symbolic` framework for sound and complete weighted real-value logic

Implementation of our paper "DMT: Dynamic Mutual Training for Semi-Supervised Learning"

This is a Deep Leaning API for classifying emotions from human face and human audios.

Mixup for Supervision, Semi- and Self-Supervision Learning Toolbox and Benchmark

E-RAFT: Dense Optical Flow from Event Cameras

Simple image captioning model - CLIP prefix captioning.

Implementation of our paper "Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning".

Official implementation for (Show, Attend and Distill: Knowledge Distillation via Attention-based Feature Matching, AAAI-2021)

Machine Translation Implement By Bi-GRU And Transformer

AdelaiDepth is an open source toolbox for monocular depth prediction.