mkultra

mkultra is a prompt tuning toolkit for GPT-2 and GPT-Neo.

Prompt tuning injects a string of 20-100 special tokens into the context to influence text generation. These tokens are trained on a corpus much like a finetune, but take up a fraction of the space. The Neuromancer example is only 401 KB for 100 tokens.
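To put "a fraction of the space" in numbers, here is a rough back-of-the-envelope sketch. It assumes GPT-2 small ("gpt2"), whose embedding width is 768, and uses an approximate figure of 124 million parameters for the full model. The constant names are illustrative and not part of mkultra.

```python
# Rough size arithmetic for a soft prompt on GPT-2 small.
# Assumptions: 768-wide embeddings, ~124M parameters in the full model.
N_TOKENS = 100                    # soft prompt length of the Neuromancer example
HIDDEN_SIZE = 768                 # embedding width of GPT-2 small
GPT2_SMALL_PARAMS = 124_000_000   # approximate parameter count

# Each soft token is one learned embedding vector.
soft_prompt_params = N_TOKENS * HIDDEN_SIZE

# Raw storage if every value is kept as a 32-bit float.
raw_float32_bytes = soft_prompt_params * 4

# Share of the full model that actually gets trained and stored.
fraction = soft_prompt_params / GPT2_SMALL_PARAMS

print(soft_prompt_params)          # 76800
print(raw_float32_bytes / 1024)    # 300.0
print(f"{fraction:.4%}")           # 0.0619%
```

The file on disk is somewhat larger than the raw 300 KiB, presumably because the values are stored in a JSON file along with some metadata.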

Read the original paper: https://arxiv.org/abs/2104.08691

Text Generation

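from transformers import pipeline
# GPT2SoftPromptLM, GPT2SPTokenizerFast and SoftPrompt are provided by mkultra.

# Load GPT-2 with soft prompt support, plus the matching tokenizer.
model = GPT2SoftPromptLM.from_pretrained("gpt2")
tokenizer = GPT2SPTokenizerFast.from_pretrained("gpt2")
generator = pipeline('text-generation', model=model, tokenizer=tokenizer)

# Load a trained soft prompt and prepend it to the text prompt.
sp = SoftPrompt.from_file("sample_sps/finetune/neuromancer_gpt2.json")
prompt = sp + "The sky over the port"
output = generator(prompt)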

SoftPrompts can be concatenated at any point into your context as if they were strings. When the context is printed, SoftPrompts show up as human-readable tags for debugging. They also tokenize to the underlying number of tokens for easy budgeting.
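The behaviour described above can be pictured with a small toy model. This is a hypothetical sketch, not mkultra's implementation: `ToySoftPrompt`, the `<name:length>` tag format, and the word-based `count_tokens` helper are all invented for illustration. It only shows how an object can concatenate like a string, print as a readable tag, and still count as its full token length when budgeting.

```python
import re

class ToySoftPrompt:
    """Hypothetical stand-in for a soft prompt; NOT mkultra's SoftPrompt class."""

    def __init__(self, name, n_tokens):
        self.name = name
        self.n_tokens = n_tokens

    def __str__(self):
        # Human-readable tag that records how many tokens it stands for.
        return f"<{self.name}:{self.n_tokens}>"

    def __add__(self, other):
        # soft_prompt + "text"
        return str(self) + str(other)

    def __radd__(self, other):
        # "text" + soft_prompt
        return str(other) + str(self)

# Matches the invented tag format, e.g. <neuromancer:100>.
TAG = re.compile(r"<([^:<>]+):(\d+)>")

def count_tokens(context):
    """Toy budget: a tag counts as its declared length, each word as one token."""
    soft = sum(int(n) for _, n in TAG.findall(context))
    words = len(TAG.sub(" ", context).split())
    return soft + words

sp = ToySoftPrompt("neuromancer", 100)
context = "Chapter 1. " + sp + "The sky over the port"
print(context)                # Chapter 1. <neuromancer:100>The sky over the port
print(count_tokens(context))  # 107
```

A real tokenizer splits text into subword tokens rather than words, so the count for the plain-text part will differ in practice.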

See the text generation notebook for pointers on adding mkultra to your generator.

Training

For finetune-like soft prompts, the finetune notebook demonstrates training on a corpus.

For AI text adventures or writing, the World Info notebook demonstrates tuning a soft prompt to describe a character or setting. This is highly experimental.

Limitations (for now)

The Hugging Face Trainer class should work as long as you set params=[model.get_soft_params()] on the optimizer, but it will still save full model checkpoints.
mkultra syncs a set of special tokens between its tokenizers behind the scenes. Adding your own tokens may result in unexpected behaviour.
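The reason for handing only the soft parameters to the optimizer can be shown with a framework-free toy. Everything here is hypothetical (the one-weight "model", the numeric-gradient `sgd_step`, the parameter names). It is a sketch of the general idea, not of mkultra or the Trainer: parameters that are not passed to the optimizer stay frozen, and only the trained ones need to be saved.

```python
import json

def loss(params, x, target):
    # Toy model: the frozen weight scales the input shifted by the soft value.
    pred = params["frozen_weight"] * (params["soft_prompt"] + x)
    return (pred - target) ** 2

def sgd_step(params, trainable, x, target, lr=0.05, eps=1e-6):
    """Update only the parameters named in `trainable` (numeric gradients)."""
    base = loss(params, x, target)
    grads = {}
    for name in trainable:
        bumped = dict(params)
        bumped[name] += eps
        grads[name] = (loss(bumped, x, target) - base) / eps
    for name, grad in grads.items():
        params[name] -= lr * grad

params = {"frozen_weight": 2.0, "soft_prompt": 0.0}
trainable = ["soft_prompt"]   # the only parameter the "optimizer" sees

for _ in range(50):
    sgd_step(params, trainable, x=1.0, target=6.0)

print(params["frozen_weight"])          # 2.0, untouched
print(round(params["soft_prompt"], 3))  # 2.0, since 2 * (2 + 1) = 6

# Saving just the trained part keeps the checkpoint small.
checkpoint = json.dumps({name: params[name] for name in trainable})
```

The same reasoning explains the limitation above: a training loop that saves the whole model writes out the frozen weights too, even though only the soft prompt changed.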
