This repository contains the code for the paper in EMNLP 2021: "HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression".

Last update: Mar 24, 2022

Overview

HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression

Requirements

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
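
After the build finishes, a quick import check can confirm that apex is visible to the Python interpreter you plan to train with. This is a minimal sketch, assuming `python` on your PATH is that interpreter; the variable name `apex_status` is only illustrative.

```shell
# Check whether the apex package can be imported by the active Python.
# Assumption: `python` points at the environment used for training.
if python -c "import apex" 2>/dev/null; then
  apex_status="importable"
else
  apex_status="not importable"
fi
echo "apex is $apex_status"
```

If the check reports "not importable", the build most likely ran against a different environment than the one currently active.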

Download checkpoints

Download the vocabulary file of BERT-base (uncased) from HERE, and put it into ./pretrained_ckpt/.
Download the pre-trained checkpoint of BERT-base (uncased) from HERE, and put it into ./pretrained_ckpt/.
Download the 2nd general distillation checkpoint of TinyBERT from HERE, and extract it into ./pretrained_ckpt/.
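
The directory used in the steps above can be prepared and inspected as follows. This is a minimal sketch to run from the repository root; it makes no assumption about the names of the downloaded files.

```shell
# Create the checkpoint directory referenced in the download steps.
mkdir -p ./pretrained_ckpt

# After downloading and extracting, list the contents to confirm that
# the vocabulary file and both checkpoints are in place.
ls -l ./pretrained_ckpt
```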

Prepare dataset

Download the GLUE dataset (containing MNLI) using the script HERE, and put the files into ./dataset/glue/.
Download the Amazon Reviews dataset from HERE, and extract it into ./dataset/amazon_review/.
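
The dataset layout described above can be set up and checked with a short loop. This is a sketch to run from the repository root; it only reports whether each directory is still empty and does not assume any particular file names inside them.

```shell
# Create the two dataset directories named in the instructions.
mkdir -p ./dataset/glue ./dataset/amazon_review

# Report whether each directory has been populated yet.
for d in ./dataset/glue ./dataset/amazon_review; do
  if [ -z "$(ls -A "$d")" ]; then
    echo "empty: $d"
  else
    echo "ok: $d"
  fi
done
```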

Train the teacher model (BERT$_{\rm B}$-single) from single-domain

bash train_domain.sh

Distill the student model (BERT$_{\rm S}$) with TinyBERT-KD from single-domain

bash finetune_domain.sh

Train the teacher model (HRKD-teacher) from multi-domain

bash train_multi_domain.sh

Then move the resulting checkpoints into the directories that the distillation step expects (see the beginning of finetune_multi_domain.py for the exact paths).
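
One way to find the expected checkpoint directories is to print the top of finetune_multi_domain.py. This is a minimal sketch; the 40-line window is an arbitrary assumption, so widen it if the paths are defined further down.

```shell
# Print the beginning of the script, where the checkpoint paths are described.
# Assumption: 40 lines is enough to cover them; adjust as needed.
script=finetune_multi_domain.py
if [ -f "$script" ]; then
  head -n 40 "$script"
else
  echo "$script not found; run this from the repository root" >&2
fi
```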

Distill the student model (BERT$_{\rm S}$) with our HRKD from multi-domain

bash finetune_multi_domain.sh
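
Before starting a long run, it can help to confirm that all four scripts named in this README are present in the working directory. This is a sketch of such a pre-flight check; the counter name `missing` is only illustrative.

```shell
# Check that each training and distillation script exists in the current directory.
missing=0
for s in train_domain.sh finetune_domain.sh train_multi_domain.sh finetune_multi_domain.sh; do
  if [ -f "$s" ]; then
    echo "found: $s"
  else
    echo "missing: $s"
    missing=$((missing + 1))
  fi
done
echo "missing scripts: $missing"
```

A non-zero count usually means the commands are being run from outside the repository root.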

Reference

If you find this code helpful for your research, please cite the following paper.

@inproceedings{dong2021hrkd,
  title     = {{HRKD}: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression},
  author    = {Chenhe Dong and Yaliang Li and Ying Shen and Minghui Qiu},
  booktitle = {Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year      = {2021}
}

Owner

Chenhe Dong
