EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks

Overview

EncT5

(Unofficial) PyTorch implementation of EncT5: Fine-tuning T5 Encoder for Non-autoregressive Tasks.

About

  • Fine-tune the T5 model for classification & regression using only the encoder layers.
  • Implements a tokenizer and model for EncT5.
  • Adds a BOS token (<s>) to the tokenizer and uses this token for classification & regression (see the sketch after this list).
    • The embedding matrix needs to be resized because the vocab size changes (model.resize_token_embeddings()).
  • BOS and EOS tokens are added automatically, as below:
    • single sequence: <s> X </s>
    • pair of sequences: <s> A </s> B </s>
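
A minimal sketch of the idea, not the repository's exact implementation (the class name and head are illustrative): the input starts with the BOS token, only the T5 encoder is run, and the hidden state at the BOS position feeds a small classification/regression head.

from torch import nn
from transformers import T5EncoderModel

class EncT5SketchForClassification(nn.Module):
    """Illustrative only: T5 encoder + a head on the BOS token's hidden state."""
    def __init__(self, model_name="t5-base", num_labels=2):
        super().__init__()
        self.encoder = T5EncoderModel.from_pretrained(model_name)   # encoder layers only, no decoder
        self.head = nn.Linear(self.encoder.config.d_model, num_labels)

    def forward(self, input_ids, attention_mask=None):
        hidden = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        bos_hidden = hidden[:, 0, :]      # BOS is prepended, so it sits at position 0
        return self.head(bos_hidden)      # logits (use num_labels=1 for regression)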

Requirements

It is highly recommended to use the same package versions as listed below, especially for transformers.

transformers==4.15.0
torch==1.8.1
sentencepiece==0.1.96
datasets==1.17.0
scikit-learn==0.24.2

How to Use

from enc_t5 import EncT5ForSequenceClassification, EncT5Tokenizer

model = EncT5ForSequenceClassification.from_pretrained("t5-base")
tokenizer = EncT5Tokenizer.from_pretrained("t5-base")

# Resize embedding size as we added bos token
if model.config.vocab_size < len(tokenizer.get_vocab()):
    model.resize_token_embeddings(len(tokenizer.get_vocab()))
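
A short usage sketch, assuming the model follows the standard transformers SequenceClassifierOutput interface (the example text is a placeholder):

import torch

inputs = tokenizer("This movie was great!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)          # BOS/EOS are added automatically by the tokenizer
logits = outputs.logits                # shape: (batch_size, num_labels)
prediction = logits.argmax(dim=-1)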

Finetune on GLUE

Setup

  • Use T5 1.1 base for fine-tuning.
  • Evaluate on TPU. See run_glue_tpu.sh for more details.
  • Use the AdamW optimizer instead of Adafactor.
  • Check for the best checkpoint every epoch via EarlyStoppingCallback (a rough Trainer sketch follows this list).
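
A rough sketch of that setup with the transformers Trainer (hyperparameters are placeholders, not the values from run_glue_tpu.sh; train_dataset and eval_dataset are assumed to be tokenized GLUE splits):

from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="./outputs",
    evaluation_strategy="epoch",            # evaluate every epoch
    save_strategy="epoch",                  # so the best checkpoint can be restored
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",      # required by EarlyStoppingCallback
    greater_is_better=False,
    num_train_epochs=10,
)

trainer = Trainer(
    model=model,                            # EncT5ForSequenceClassification from above
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()                             # Trainer uses AdamW by default (not Adafactor)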

Results

Task  | Metric               | Result (Paper) | Result (Implementation)
CoLA  | Matthews corr.       | 53.1           | 52.4
SST-2 | Accuracy             | 94.0           | 94.5
MRPC  | F1 / Accuracy        | 91.5 / 88.3    | 91.7 / 88.0
STS-B | PCC / SCC            | 80.5 / 79.3    | 88.0 / 88.3
QQP   | F1 / Accuracy        | 72.9 / 89.8    | 88.4 / 91.3
MNLI  | Mismatched / Matched | 88.0 / 86.7    | 87.5 / 88.1
QNLI  | Accuracy             | 93.3           | 93.2
RTE   | Accuracy             | 67.8           | 69.7
Comments
  • Enable tokenizer to be loaded by sentence-transformer

    🚀 Feature Request

    Integration into the sentence-transformers library.

    📎 Additional context

    I tried to load this tokenizer with the sentence-transformers library, but it failed: AutoTokenizer couldn't load it. So I simply added code to override save_pretrained and its dependencies so that this tokenizer is saved as T5Tokenizer, its superclass.

        def save_pretrained(
            self,
            save_directory,
            legacy_format: Optional[bool] = None,
            filename_prefix: Optional[str] = None,
            push_to_hub: bool = False,
            **kwargs,
        ):
            if os.path.isfile(save_directory):
                logger.error(f"Provided path ({save_directory}) should be a directory, not a file")
                return
    
            if push_to_hub:
                commit_message = kwargs.pop("commit_message", None)
                repo = self._create_or_get_repo(save_directory, **kwargs)
    
            os.makedirs(save_directory, exist_ok=True)
    
            special_tokens_map_file = os.path.join(
                save_directory, (filename_prefix + "-" if filename_prefix else "") + SPECIAL_TOKENS_MAP_FILE
            )
            tokenizer_config_file = os.path.join(
                save_directory, (filename_prefix + "-" if filename_prefix else "") + TOKENIZER_CONFIG_FILE
            )
    
            tokenizer_config = copy.deepcopy(self.init_kwargs)
            if len(self.init_inputs) > 0:
                tokenizer_config["init_inputs"] = copy.deepcopy(self.init_inputs)
            for file_id in self.vocab_files_names.keys():
                tokenizer_config.pop(file_id, None)
    
            # Sanitize AddedTokens
            def convert_added_tokens(obj: Union[AddedToken, Any], add_type_field=True):
                if isinstance(obj, AddedToken):
                    out = obj.__getstate__()
                    if add_type_field:
                        out["__type"] = "AddedToken"
                    return out
                elif isinstance(obj, (list, tuple)):
                    return list(convert_added_tokens(o, add_type_field=add_type_field) for o in obj)
                elif isinstance(obj, dict):
                    return {k: convert_added_tokens(v, add_type_field=add_type_field) for k, v in obj.items()}
                return obj
    
            # add_type_field=True to allow dicts in the kwargs / differentiate from AddedToken serialization
            tokenizer_config = convert_added_tokens(tokenizer_config, add_type_field=True)
    
            # Add tokenizer class to the tokenizer config to be able to reload it with from_pretrained
            ############################################################################
            # MODIFIED: write the parent class name (T5Tokenizer for EncT5Tokenizer)
            # instead of the subclass name, so standard loaders can reload the saved tokenizer.
            tokenizer_class = self.__class__.__base__.__name__
            ############################################################################
            # Remove the Fast at the end unless we have a special `PreTrainedTokenizerFast`
            if tokenizer_class.endswith("Fast") and tokenizer_class != "PreTrainedTokenizerFast":
                tokenizer_class = tokenizer_class[:-4]
            tokenizer_config["tokenizer_class"] = tokenizer_class
            if getattr(self, "_auto_map", None) is not None:
                tokenizer_config["auto_map"] = self._auto_map
            if getattr(self, "_processor_class", None) is not None:
                tokenizer_config["processor_class"] = self._processor_class
    
            # If we have a custom model, we copy the file defining it in the folder and set the attributes so it can be
            # loaded from the Hub.
            if self._auto_class is not None:
                custom_object_save(self, save_directory, config=tokenizer_config)
    
            with open(tokenizer_config_file, "w", encoding="utf-8") as f:
                f.write(json.dumps(tokenizer_config, ensure_ascii=False))
            logger.info(f"tokenizer config file saved in {tokenizer_config_file}")
    
            # Sanitize AddedTokens in special_tokens_map
            write_dict = convert_added_tokens(self.special_tokens_map_extended, add_type_field=False)
            with open(special_tokens_map_file, "w", encoding="utf-8") as f:
                f.write(json.dumps(write_dict, ensure_ascii=False))
            logger.info(f"Special tokens file saved in {special_tokens_map_file}")
    
            file_names = (tokenizer_config_file, special_tokens_map_file)
    
            save_files = self._save_pretrained(
                save_directory=save_directory,
                file_names=file_names,
                legacy_format=legacy_format,
                filename_prefix=filename_prefix,
            )
    
            if push_to_hub:
                url = self._push_to_hub(repo, commit_message=commit_message)
                logger.info(f"Tokenizer pushed to the hub in this commit: {url}")
    
            return save_files
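
    The intended effect (a sketch, not tested; assumes tokenizer is an EncT5Tokenizer with this override applied): after saving, the directory records tokenizer_class as T5Tokenizer, so the stock loaders used by sentence-transformers can pick it up.

        from transformers import AutoTokenizer

        tokenizer.save_pretrained("./enct5-tokenizer")                  # writes tokenizer_class = "T5Tokenizer"
        reloaded = AutoTokenizer.from_pretrained("./enct5-tokenizer")   # now loads without the custom class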
    
    enhancement · opened by kwonmha
Releases(v1.0.0)
  • v1.0.0(Jan 22, 2022)

    What’s Changed

    🚀 Features

    • Add GLUE Trainer (#2) @monologg
    • Add Template & EncT5 model and tokenizer (#1) @monologg

    📝 Documentation

    • Add readme & script (#3) @monologg
Owner
Jangwon Park