poutyne-transformers

Train 🤗 -transformers models with Poutyne.

Installation

pip install poutyne-transformers

Example

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from datasets import load_dataset
from torch.utils.data import DataLoader
from torch import optim
from poutyne import Model
from poutyne_transformers import TransformerCollator, model_loss, ModelWrapper

print('Loading model & tokenizer.')
transformer = AutoModelForSequenceClassification.from_pretrained('distilbert-base-cased', num_labels=2, return_dict=True)
tokenizer = AutoTokenizer.from_pretrained('distilbert-base-cased')

print('Loading & preparing dataset.')
dataset = load_dataset("imdb")
dataset = dataset.map(lambda entry: tokenizer(entry['text'], add_special_tokens=True, padding='max_length', truncation=True), batched=True)
dataset = dataset.remove_columns(['text'])
dataset.set_format('torch')

collate_fn = TransformerCollator()
train_dataloader = DataLoader(dataset['train'], batch_size=16, collate_fn=collate_fn)
test_dataloader = DataLoader(dataset['test'], batch_size=16, collate_fn=collate_fn)

print('Preparing training.')
wrapped_transformer = ModelWrapper(transformer)
optimizer = optim.AdamW(wrapped_transformer.parameters(), lr=5e-5)
device = torch.device('cuda:0' if torch.cuda.is_available() else "cpu")
model = Model(wrapped_transformer, optimizer, loss_function=model_loss, device=device)

print('Starting training.')
model.fit_generator(train_dataloader, test_dataloader, epochs=1)

Train 🤗-transformers model with Poutyne.

Related tags

Overview

poutyne-transformers

Installation

Example

Owner

Lennart Keller

NLP-Project - Used an API to scrape 2000 reddit posts, then used NLP analysis and created a classification model to mixed succcess

Semantic search for quotes.

Applying "Load What You Need: Smaller Versions of Multilingual BERT" to LaBSE

Library for Russian imprecise rhymes generation

A Python 3.6+ package to run .many files, where many programs written in many languages may exist in one file.

Fast, general, and tested differentiable structured prediction in PyTorch

Creating an Audiobook (mp3 file) using a Ebook (epub) using BeautifulSoup and Google Text to Speech

ACL22 paper: Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost

A PyTorch implementation of paper "Learning Shared Semantic Space for Speech-to-Text Translation", ACL (Findings) 2021

[ICLR 2021 Spotlight] Pytorch implementation for "Long-tailed Recognition by Routing Diverse Distribution-Aware Experts."

Research code for the paper "Fine-tuning wav2vec2 for speaker recognition"

DiY Oxygen Concentrator based on the OxiKit

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

This is my reading list for my PhD in AI, NLP, Deep Learning and more.

Repository to hold code for the cap-bot varient that is being presented at the SIIC Defence Hackathon 2021.

Sinkhorn Transformer - Practical implementation of Sparse Sinkhorn Attention

SEJE is a prototype for the paper Learning Text-Image Joint Embedding for Efficient Cross-Modal Retrieval with Deep Feature Engineering.

Snips Python library to extract meaning from text

中文无监督SimCSE Pytorch实现

Klexikon: A German Dataset for Joint Summarization and Simplification