API for the GPT-J language model 🦜. Including a FastAPI backend and a streamlit frontend

Overview

gpt-j-api 🦜

GitHub release (latest by date) Python version API up

An API to interact with the GPT-J language model. You can use and test the model in two different ways:

Using the API

  • Python:
import requests
context = "In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English."
payload = {
    "context": context,
    "token_max_length": 512,
    "temperature": 1.0,
    "top_p": 0.9,
}
response = requests.post("http://api.vicgalle.net:5000/generate", params=payload).json()
print(response)
  • Bash:
curl -X 'POST' \
  'http://api.vicgalle.net:5000/generate?context=In%20a%20shocking%20finding%2C%20scientists%20discovered%20a%20herd%20of%20unicorns%20living%20in%20a%20remote%2C%20previously%20unexplored%20valley%2C%20in%20the%20Andes%20Mountains.%20Even%20more%20surprising%20to%20the%20researchers%20was%20the%20fact%20that%20the%20unicorns%20spoke%20perfect%20English.&token_max_length=512&temperature=1&top_p=0.9' \
  -H 'accept: application/json' \
  -d ''

Deployment of the API server

Just ssh into a TPU VM. This code was only tested on the v3-8 variants.

First, install the requirements and get the weigts:

python3 -m pip install -r requirements.txt
wget https://the-eye.eu/public/AI/GPT-J-6B/step_383500_slim.tar.zstd
sudo apt install zstd
tar -I zstd -xf step_383500_slim.tar.zstd

And just run

python3 serve.py

Then, you can go to http://localhost:5000/docs and use the API!

Deploy the streamlit dashboard

Just run

python3 -m streamlit run streamlit_app.py --server.port 8000

Acknowledgements

Thanks to the support of the TPU Research Cloud, https://sites.research.google/trc/

Comments
  • I've made an extensions using this api

    I've made an extensions using this api

    https://chrome.google.com/webstore/detail/type-j/femdhcgkiiagklmickakfoogeehbjnbh

    You can check it out here

    First i was very hyped up and it felt fun, like I was talking to a machine, but then I lost my enthusiasm and now I feel like it's totally useless xD

    I'm just leaving a link here for you to appreciate you, it became real thanks for you posting this api

    feel free to delete the issue as it's out of scope

    if you got ideas on how to make it commercially succesful - i'll be happy to partner up

    peace

    opened by oogxdd 5
  • Illegal Instruction

    Illegal Instruction

    When installing like described in the readme (fresh conda env,python=3.8, ubuntu) I'll get a illegal instruction immediately after running python serve.py

    (gpt-j-api) […]@[…]:/opt/GPT/gpt-j-api$ python -q -X faulthandler serve.py
    Fatal Python error: Illegal instruction
    
    Current thread 0x00007f358d7861c0 (most recent call first):
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap_external>", line 1166 in create_module
      File "<frozen importlib._bootstrap>", line 556 in module_from_spec
      File "<frozen importlib._bootstrap>", line 657 in _load_unlocked
      File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 991 in _find_and_load
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1042 in _handle_fromlist
      File "/home/korny/miniconda3/envs/gpt-j-api/lib/python3.8/site-packages/jaxlib/xla_client.py", lin
    e 31 in <module>
      File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 991 in _find_and_load
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1042 in _handle_fromlist
      File "/home/korny/miniconda3/envs/gpt-j-api/lib/python3.8/site-packages/jax/lib/__init__.py", line 58 in <module>
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap_external>", line 843 in exec_module
      File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
      File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 991 in _find_and_load
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap>", line 1042 in _handle_fromlist
      File "/home/korny/miniconda3/envs/gpt-j-api/lib/python3.8/site-packages/jax/config.py", line 26 in <module>
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap_external>", line 843 in exec_module
      File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
      File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 991 in _find_and_load
      File "/home/korny/miniconda3/envs/gpt-j-api/lib/python3.8/site-packages/jax/__init__.py", line 33 in <module>
      File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
      File "<frozen importlib._bootstrap_external>", line 843 in exec_module
      File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
      File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 991 in _find_and_load
      File "serve.py", line 3 in <module>
    Illegal instruction (core dumped)
    

    EDIT

    running this on CPU only, I tried installing jax[CPU] - same resut

    opened by chris-aeviator 4
  • api seems offline

    api seems offline

    When I try to access the API I get the following error: ERR_CONNECTION_TIMED_OUT. But when I try to connect to it using a different IP address it does work. Am I IP banned?

    opened by KoenTech 4
  • Usage

    Usage

    I'm using this to host a Discord chatbot, and though I have slowmode on the channel there's still a lot of usage, and often the API is being used as fast as it can generate completions. Will this harm the experience for others? Should I limit it more? (thanks for making this free but I don't want to take advantage of that too much if it's bad for others)

    opened by Heath123 3
  • Alternative to Google TPU VM?

    Alternative to Google TPU VM?

    Hello,

    I would like to run a local instance of GPT-J, but avoid using Google.

    I have little to no experience in machine learning and its requirements, are there other solutions I could use? (What are the requirements for a machine in order to run GPT-J?)

    Thank you very much!

    opened by birkenbaum 3
  • Is there a way to speed up inference?

    Is there a way to speed up inference?

    Hello, I am currently working on a project where I need quick inference. It needn't be real-time, but something around 7-10 sec would be great. Is there a way to speed up the inference using the API?

    The model does not seem to be a problem as compute_time is around 8sec, but by the time the request arrives it takes around 20 seconds (over 30 on some occasions). Is there a way to make the request a bit faster?

    Thanks,

    opened by Aryagm 2
  • Errno 111

    Errno 111

    Could anyone please fix the following error? Thanks a lot.

    "ConnectionError: ...Failed to establish a new connection: [Errno 111] Connection refused"

    opened by Mather10 1
  • How to make the api public?

    How to make the api public?

    Hey, I was able to get serve.py running with the instructions you gave. But now I want to make the api public and connect it to a domain name so it can be publicly accessed (without needing a connection to the vm). How can I achieve this?

    I want to do the same thing you did with "http://api.vicgalle.net:5000/generate" and "http://api.vicgalle.net:5000/docs".

    Thanks,

    opened by Aryagm 1
  • API VM?

    API VM?

    Hi I wanted to host my own version of the api, where is the public one hosted? is it on a google cloud TPU VM? The ones ive seen here https://cloud.google.com/tpu/pricing are very expensive :D Is a TPU VM needed and the model won't be able to run on a normal GPU VM?

    Thanks!

    opened by jryebread 1
  • Raw text...

    Raw text...

    This is probably a very stupid question but whenever I run GPT-J I always get the full output:

    {'model': 'GPT-J-6B', 'compute_time': 1.2492187023162842, 'text': ' \n(and you\'ll be a slave)\n\n**_"I\'m not a robot, I\'m a human being."_**\n\n**_"I\'m not a robot, I\'m a human being."_**\n\n', 'prompt': 'AI will take over the world ', 'token_max_length': 50, 'temperature': 0.09, 'top_p': 0.9, 'stop_sequence': None}

    What parameter do I need to change so it only outputs the generated text?

    (and you'll be a slave) I'm not a robot, I'm a human being. I'm not a robot, I'm a human being.

    opened by Vilagamer999 1
  • Latency with TPU VM

    Latency with TPU VM

    Got things running on Google Clouds, really happy :). Was hoping for a little but of a speed increase, but computation time is the same and latency on the request seems to be the main delay. Did you experiment with firewalls and ports to improve things?

    opened by Ontopic 1
  • Version support for Huggingface GPT-J 6B

    Version support for Huggingface GPT-J 6B

    GPT-J Huggingface and streamlit style like by project-code py

    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

    model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

    opened by ghost 0
Releases(v0.3)
Owner
Víctor Gallego
Data scientist & predoc researcher
Víctor Gallego
This is my reading list for my PhD in AI, NLP, Deep Learning and more.

This is my reading list for my PhD in AI, NLP, Deep Learning and more.

Zhong Peixiang 156 Dec 21, 2022
An open-source NLP research library, built on PyTorch.

An Apache 2.0 NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks. Quic

AI2 11.4k Jan 01, 2023
Text Classification in Turkish Texts with Bert

You can watch the details of the project on my youtube channel Project Interface Project Second Interface Goal= Correctly guessing the classification

42 Dec 31, 2022
nlp基础任务

NLP算法 说明 此算法仓库包括文本分类、序列标注、关系抽取、文本匹配、文本相似度匹配这五个主流NLP任务,涉及到22个相关的模型算法。 框架结构 文件结构 all_models ├── Base_line │   ├── __init__.py │   ├── base_data_process.

zuxinqi 23 Sep 22, 2022
Code for Editing Factual Knowledge in Language Models

KnowledgeEditor Code for Editing Factual Knowledge in Language Models (https://arxiv.org/abs/2104.08164). @inproceedings{decao2021editing, title={Ed

Nicola De Cao 86 Nov 28, 2022
The official code for “DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction”, ACM MM, Oral Paper, 2021.

Good news! Our new work exhibits state-of-the-art performances on DocUNet benchmark dataset: DocScanner: Robust Document Image Rectification with Prog

Hao Feng 231 Dec 26, 2022
this repository has datasets containing information of Uber pickups in NYC from April 2014 to September 2014 and January to June 2015. data Analysis , virtualization and some insights are gathered here

uber-pickups-analysis Data Source: https://www.kaggle.com/fivethirtyeight/uber-pickups-in-new-york-city Information about data set The dataset contain

1 Nov 02, 2021
Implementation of "Adversarial purification with Score-based generative models", ICML 2021

Adversarial Purification with Score-based Generative Models by Jongmin Yoon, Sung Ju Hwang, Juho Lee This repository includes the official PyTorch imp

15 Dec 15, 2022
A minimal code for fairseq vq-wav2vec model inference.

vq-wav2vec inference A minimal code for fairseq vq-wav2vec model inference. Runs without installing the fairseq toolkit and its dependencies. Usage ex

Vladimir Larin 7 Nov 15, 2022
PyTorch Implementation of the paper Single Image Texture Translation for Data Augmentation

SITT The repo contains official PyTorch Implementation of the paper Single Image Texture Translation for Data Augmentation. Authors: Boyi Li Yin Cui T

Boyi Li 52 Jan 05, 2023
The SVO-Probes Dataset for Verb Understanding

The SVO-Probes Dataset for Verb Understanding This repository contains the SVO-Probes benchmark designed to probe for Subject, Verb, and Object unders

DeepMind 20 Nov 30, 2022
justCTF [*] 2020 challenges sources

justCTF [*] 2020 This repo contains sources for justCTF [*] 2020 challenges hosted by justCatTheFish. TLDR: Run a challenge with ./run.sh (requires Do

justCatTheFish 25 Dec 27, 2022
Repository for Project Insight: NLP as a Service

Project Insight NLP as a Service Contents Introduction Features Installation Setup and Documentation Project Details Demonstration Directory Details H

Abhishek Kumar Mishra 286 Dec 06, 2022
Adversarial Examples for Extreme Multilabel Text Classification

Adversarial Examples for Extreme Multilabel Text Classification The code is adapted from the source codes of BERT-ATTACK [1], APLC_XLNet [2], and Atte

1 May 14, 2022
Speech Recognition Database Management with python

Speech Recognition Database Management The main aim of this project is to recogn

Abhishek Kumar Jha 2 Feb 02, 2022
Extract rooms type, door, neibour rooms, rooms corners nad bounding boxes, and generate graph from rplan dataset

Housegan-data-reader House-GAN++ (data-reader) Code and instructions for converting rplan dataset (raster images) to housegan++ data format. House-GAN

Sepid Hosseini 13 Nov 24, 2022
Words-per-minute - A terminal app written in python utilizing the curses module that tests the user's ability to type

words-per-minute A terminal app written in python utilizing the curses module th

Tanim Islam 1 Jan 14, 2022
Collection of useful (to me) python scripts for interacting with napari

Napari scripts A collection of napari related tools in various state of disrepair/functionality. Browse_LIF_widget.py This module can be imported, for

5 Aug 15, 2022
This is the offline-training-pipeline for our project.

offline-training-pipeline This is the offline-training-pipeline for our project. We adopt the offline training and online prediction Machine Learning

0 Apr 22, 2022
Source code for the paper "TearingNet: Point Cloud Autoencoder to Learn Topology-Friendly Representations"

TearingNet: Point Cloud Autoencoder to Learn Topology-Friendly Representations Created by Jiahao Pang, Duanshun Li, and Dong Tian from InterDigital In

InterDigital 21 Dec 29, 2022