mkultra

mkultra is a prompt tuning toolkit for GPT-2 and GPT-Neo.

Prompt tuning injects a sequence of 20-100 special tokens into the context to influence text generation. These tokens are trained on a corpus much like a finetune, but take up a fraction of the space: the Neuromancer example is only 401 KB for 100 tokens.
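
To put that size in perspective, here is a rough back-of-the-envelope sketch. It assumes the example targets the base "gpt2" checkpoint (hidden size 768, roughly 124M parameters) and float32 storage; neither is stated above. Under those assumptions, 100 learned embedding vectors come to about 300 KiB raw, so a 401 KB file is plausible once serialization overhead is included.

```python
# Back-of-the-envelope size comparison.
# Assumptions (not stated in this README): base GPT-2 with hidden size 768
# and about 124M parameters, everything stored as float32.
HIDDEN_SIZE = 768
BYTES_PER_FLOAT32 = 4
GPT2_SMALL_PARAMS = 124_000_000  # approximate parameter count


def soft_prompt_bytes(n_tokens, hidden_size=HIDDEN_SIZE):
    """Raw bytes needed for n_tokens learned embedding vectors."""
    return n_tokens * hidden_size * BYTES_PER_FLOAT32


raw = soft_prompt_bytes(100)
full_model = GPT2_SMALL_PARAMS * BYTES_PER_FLOAT32

print(f"soft prompt: {raw / 1024:.0f} KiB raw")
print(f"full model:  {full_model / 1024**2:.0f} MiB raw")
print(f"ratio:       1 : {full_model // raw}")
```

The exact on-disk size depends on how the tensor is encoded in the JSON file, which this sketch does not model.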

Read the original paper: https://arxiv.org/abs/2104.08691

Text Generation

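from transformers import pipeline
# mkultra import paths are assumed from the repository layout; adjust if they differ.
from mkultra.models.inference import GPT2SoftPromptLM
from mkultra.tokenizers import GPT2SPTokenizerFast
from mkultra.soft_prompt import SoftPrompt

model = GPT2SoftPromptLM.from_pretrained("gpt2")
tokenizer = GPT2SPTokenizerFast.from_pretrained("gpt2")
generator = pipeline('text-generation', model=model, tokenizer=tokenizer)

# Load a trained soft prompt and prepend it to the text like a string.
sp = SoftPrompt.from_file("sample_sps/finetune/neuromancer_gpt2.json")
prompt = sp + "The sky over the port"
output = generator(prompt)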

SoftPrompts can be concatenated at any point into your context as if they were strings. When the context is printed, SoftPrompts show up as human-readable tags for debugging. They also tokenize to the underlying number of tokens for easy budgeting.
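
The behaviour described above can be pictured with a toy stand-in. This is not mkultra's implementation: the class `ToySoftPrompt`, the tag format, and the whitespace-based token count are all hypothetical, chosen only to show how a placeholder can concatenate like a string, print as a readable tag, and still count as its underlying number of tokens.

```python
# Toy illustration only; NOT mkultra's implementation.
# ToySoftPrompt, its tag format and toy_token_count are hypothetical.
class ToySoftPrompt:
    """A placeholder that behaves like a string fragment in a context."""

    def __init__(self, name, n_tokens):
        self.name = name
        self.n_tokens = n_tokens

    def __str__(self):
        # Human-readable tag shown when the context is printed.
        return f"<{self.name}:{self.n_tokens}>"

    def __add__(self, other):
        # soft_prompt + "text"
        return str(self) + str(other)

    def __radd__(self, other):
        # "text" + soft_prompt
        return str(other) + str(self)


def toy_token_count(context, prompts):
    """Each tag costs its prompt's n_tokens; remaining text is split on
    whitespace as a crude stand-in for a real tokenizer."""
    total = 0
    for sp in prompts:
        tag = str(sp)
        total += context.count(tag) * sp.n_tokens
        context = context.replace(tag, " ")
    return total + len(context.split())


sp = ToySoftPrompt("neuromancer", 100)
context = "Chapter 1. " + sp + "The sky over the port"
print(context)
print(toy_token_count(context, [sp]))
```

With a real tokenizer the text portion would be counted in subword tokens, but the budgeting idea is the same: the tag contributes a fixed, known number of tokens.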

See the text generation notebook for pointers on adding mkultra to your generator.

Training

For finetune-like soft prompts, the finetune notebook demonstrates training on a corpus.

For AI text adventures or writing, the World Info notebook demonstrates tuning a soft prompt to describe a character or setting. This is highly experimental.

Limitations (for now)

The Hugging Face Trainer class should work as long as you set params=[model.get_soft_params()] on the optimizer, but it will still save full model checkpoints.
mkultra syncs a set of special tokens between its tokenizers behind the scenes. Adding your own tokens may result in unexpected behaviour.
