A high-level yet extensible library for fast language model tuning via automatic prompt search

Last update: Dec 07, 2022

ruPrompts

ruPrompts is a high-level yet extensible library for fast language model tuning via automatic prompt search. It features integration with HuggingFace Hub, a configuration system powered by Hydra, and a command-line interface.

A prompt is a text instruction for a language model, such as:

Translate English to French:
cat =>

For some tasks the prompt is obvious; for others it is not. With ruPrompts you only need to define the prompt format, like {text}, and the prompt is then trained automatically for any task for which you have a training dataset.
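To make the idea of a prompt format concrete, here is a minimal sketch that fills a {text} placeholder in a hand-written template using plain Python string formatting. It does not use the ruPrompts API; `MANUAL_TEMPLATE` and `render_prompt` are hypothetical names chosen only for illustration. With ruPrompts, the instruction surrounding {text} would be learned from the training dataset instead of being written by hand.

```python
# Illustration only: plain Python string formatting, not the ruPrompts API.
# MANUAL_TEMPLATE and render_prompt are hypothetical names.
MANUAL_TEMPLATE = "Translate English to French:\n{text} =>"


def render_prompt(template: str, text: str) -> str:
    """Substitute the input text into the {text} placeholder of the template."""
    return template.format(text=text)


if __name__ == "__main__":
    # Reproduces the hand-written translation prompt shown earlier.
    print(render_prompt(MANUAL_TEMPLATE, "cat"))
```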

You can currently use ruPrompts for text-to-text tasks, such as summarization, detoxification, and style transfer, and for styled text generation, which is a special case of text-to-text.

Features

Modular structure for convenient extensibility
Integration with HF Transformers, support for all models with an LM head
Integration with HF Hub for sharing and loading pretrained prompts
CLI and configuration system powered by Hydra
Pretrained prompts for ruGPT-3

Installation

ruPrompts can be installed with pip:

pip install ruprompts[hydra]

See Installation for other installation options.
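After installing, one quick way to confirm that the packages used in the examples are importable is a standard-library check like the sketch below. The helper name `is_installed` is hypothetical; the package names `ruprompts` and `transformers` are the import names used in the usage examples.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    # Report whether each package needed by the usage examples is available.
    for name in ("ruprompts", "transformers"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```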

Usage

Loading a pretrained prompt for styled text generation:

>>> import ruprompts
>>> from transformers import pipeline

>>> ppln_joke = pipeline("text-generation-with-prompt", prompt="konodyuk/prompt_rugpt3large_joke")
>>> ppln_joke("Говорит кружка ложке")
[{"generated_text": 'Говорит кружка ложке: "Не бойся, не утонешь!".'}]

For text2text tasks:

>>> ppln_detox = pipeline("text2text-generation-with-prompt", prompt="konodyuk/prompt_rugpt3large_detox_russe")
>>> ppln_detox("Опять эти тупые дятлы все испортили, чтоб их черти взяли")
[{"generated_text": 'Опять эти люди все испортили'}]
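Both pipelines above return a list of dictionaries, each with a "generated_text" key. A small helper for pulling out just the strings might look like the sketch below; `generated_texts` is a hypothetical name, and the sample input simply copies the output shape shown above rather than calling a model.

```python
from typing import Dict, List


def generated_texts(outputs: List[Dict[str, str]]) -> List[str]:
    """Collect the "generated_text" value from each pipeline result."""
    return [item["generated_text"] for item in outputs]


if __name__ == "__main__":
    # Sample mirrors the detoxification output shown above; no model is run.
    sample = [{"generated_text": "Опять эти люди все испортили"}]
    print(generated_texts(sample))
```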

Proceed to Quick Start for a more detailed introduction or start using ruPrompts right now with our Colab Tutorials.

License

ruPrompts is Apache 2.0 licensed. See the LICENSE file for details.

Owner

Sber AI
