Knowledge-Inheritance

Source code for the paper: Knowledge Inheritance for Pre-trained Language Models (preprint). The trained model parameters (in Fairseq format) can be downloaded from Tsinghua Cloud. Use convert_fairseq_to_huggingface.py to convert the Fairseq checkpoints into Hugging Face's transformers format.

For downstream performance evaluation, please refer to the implementations of Fairseq (GLUE tasks) and Don't Stop Pre-training (ACL-ARC / CHEMPROT).

If you have any questions, feel free to contact us ([email protected]).

1. Available Pretrained Models

Domains: WB = Wikipedia + BookCorpus; CS = computer science papers; BIO = biomedical papers.

Models trained by self-learning

RoBERTa_WB_H_4
RoBERTa_WB_H_6
RoBERTa_WB_H_8
RoBERTa_WB_H_10
RoBERTa_WB_D_288
RoBERTa_WB_D_384
RoBERTa_WB_D_480
RoBERTa_WB_D_576
RoBERTa_WB_D_672
RoBERTa_WB_BASE
RoBERTa_WB_MEDIUM
RoBERTa_WB_BASE_PLUS
RoBERTa_WB_LARGE
GPT_WB_MEDIUM
GPT_WB_BASE
GPT_WB_BASE_PLUS
RoBERTa_CS_MEDIUM
RoBERTa_CS_BASE
RoBERTa_BIO_MEDIUM
RoBERTa_BIO_BASE

Models trained by Knowledge Inheritance

RoBERTa_WB_BASE -> RoBERTa_WB_BASE_PLUS
RoBERTa_WB_BASE -> RoBERTa_WB_LARGE
RoBERTa_WB_BASE_PLUS -> RoBERTa_WB_LARGE
RoBERTa_WB_BASE -> RoBERTa_WB_BASE_PLUS -> RoBERTa_WB_LARGE
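
The arrow notation above can be read as a training order. Assuming each model to the left of an arrow acts as the teacher for the model to its right (our reading of the notation, not something this README states explicitly), the short sketch below splits a chain into consecutive (teacher, student) steps. The function name inheritance_steps is hypothetical and is not part of this repository.

```python
from typing import List, Tuple


def inheritance_steps(chain: str) -> List[Tuple[str, str]]:
    """Split 'A -> B -> C' into consecutive (teacher, student) pairs.

    Assumption: the model on the left of each arrow is the teacher
    of the model on its right.
    """
    models = [name.strip() for name in chain.split("->")]
    if len(models) < 2 or not all(models):
        raise ValueError(f"not a valid inheritance chain: {chain!r}")
    # Pair every model with its immediate successor in the chain.
    return list(zip(models, models[1:]))


if __name__ == "__main__":
    chains = [
        "RoBERTa_WB_BASE -> RoBERTa_WB_BASE_PLUS",
        "RoBERTa_WB_BASE -> RoBERTa_WB_BASE_PLUS -> RoBERTa_WB_LARGE",
    ]
    for chain in chains:
        for teacher, student in inheritance_steps(chain):
            print(f"{teacher} is inherited by {student}")
```

Under this reading, the last entry in the list is a two-step chain: RoBERTa_WB_BASE_PLUS first inherits from RoBERTa_WB_BASE, and RoBERTa_WB_LARGE then inherits from RoBERTa_WB_BASE_PLUS.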

Owner

THUNLP
