A repository to run GPT-J-6B on low-VRAM machines (minimum 4.2 GB VRAM for a 2000-token context, 3.5 GB for a 1000-token context). Loading the model requires 12 GB of free RAM.

Last update: Dec 25, 2022

Overview

Basic-UI-for-GPT-J-6B-with-low-vram

A repository to run GPT-J-6B on low-VRAM systems by splitting the model across RAM, VRAM and pinned memory.
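As a rough check on the 12 GB loading figure, the size of the weights can be estimated from the parameter count. This is only a sketch: it assumes about 6.05 billion parameters for GPT-J-6B and 2 bytes per parameter for 16-bit weights, and the constant and helper names are illustrative, not taken from this repository.

```python
# Rough estimate of the memory needed just to hold the model weights.
# ASSUMPTION: GPT-J-6B has about 6.05 billion parameters.
N_PARAMS = 6_053_381_344


def weight_gigabytes(n_params: int, bytes_per_param: int) -> float:
    """Return the weight size in GB (10^9 bytes). Hypothetical helper."""
    return n_params * bytes_per_param / 1e9


fp16_gb = weight_gigabytes(N_PARAMS, 2)  # 16-bit weights
fp32_gb = weight_gigabytes(N_PARAMS, 4)  # 32-bit weights, for comparison

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f"32-bit weights: {fp32_gb:.1f} GB")
```

Under these assumptions the 16-bit weights come to roughly 12 GB, which is in line with the free RAM quoted for loading. Activations and the per-token cache are not included in this estimate.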

The weights at the drive link appear to have some issues: there is some performance loss, most likely caused by a poor 16-bit conversion.

How to run:

Install with - pip install git+https://github.com/finetuneanon/[email protected]
Download the model from https://drive.google.com/file/d/1tboTvohQifN6f1JiSV8hnciyNKvj9pvm/view?usp=sharing - it has been saved as described here - https://github.com/arrmansa/saving-and-loading-large-models-pytorch

Timing (2000 token context)

1

system -

16 GB DDR4 RAM, GTX 1070 GPU with 8 GB VRAM.
23 blocks in RAM (ram_blocks = 23), of which 18 are in shared/pinned memory (max_shared_ram_blocks = 18).
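To make the two settings concrete, here is a minimal sketch of how they could divide the model's blocks between memory types. It assumes GPT-J-6B's 28 transformer blocks, and it assumes that the blocks not counted in ram_blocks stay in VRAM. The function name and the ordering of blocks are hypothetical and are not taken from this repository's code.

```python
from collections import Counter

# ASSUMPTION: GPT-J-6B has 28 transformer blocks.
TOTAL_BLOCKS = 28


def plan_placement(ram_blocks, max_shared_ram_blocks, total_blocks=TOTAL_BLOCKS):
    """Label each block with the memory it would live in (illustrative only).

    ASSUMPTION: the first `ram_blocks` blocks are kept in RAM, the first
    `max_shared_ram_blocks` of those in shared/pinned memory, and the
    remaining blocks stay in VRAM.
    """
    if not 0 <= ram_blocks <= total_blocks:
        raise ValueError("ram_blocks must be between 0 and total_blocks")
    if max_shared_ram_blocks < 0:
        raise ValueError("max_shared_ram_blocks must not be negative")
    pinned = min(ram_blocks, max_shared_ram_blocks)
    placement = []
    for index in range(total_blocks):
        if index < pinned:
            placement.append("pinned_ram")
        elif index < ram_blocks:
            placement.append("regular_ram")
        else:
            placement.append("vram")
    return placement


# Settings quoted for the 8 GB GPU system.
summary = Counter(plan_placement(ram_blocks=23, max_shared_ram_blocks=18))
print(dict(summary))
```

Under these assumptions, raising ram_blocks moves blocks out of VRAM at the cost of more transfers per forward pass, which is why a card with less VRAM would need a higher value.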

timing -

A single run of model(inputs) takes 6.5 seconds.
35 seconds to generate 25 tokens at a 2000-token context (1.4 seconds/token).

2

system -

16 GB DDR4 RAM, GTX 1060 GPU with 6 GB VRAM.
26 blocks in RAM (ram_blocks = 26), of which 18 are in shared/pinned memory (max_shared_ram_blocks = 18).

timing -

40 seconds to generate 25 tokens at a 2000-token context (1.6 seconds/token).
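The per-token figures above follow directly from the total generation time divided by the number of tokens. A small sketch of that arithmetic, using only the numbers quoted in the two timing sections (the dictionary keys and the helper name are illustrative):

```python
def seconds_per_token(total_seconds: float, tokens: int) -> float:
    """Average generation time per token. Hypothetical helper."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return total_seconds / tokens


# (total seconds, tokens generated) at a 2000-token context, as reported above.
runs = {
    "system 1 (8 GB GPU)": (35, 25),
    "system 2 (6 GB GPU)": (40, 25),
}

rates = {name: seconds_per_token(secs, toks) for name, (secs, toks) in runs.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} seconds/token, {1 / rate:.2f} tokens/second")
```

The same helper can be used to compare other ram_blocks settings, provided the runs are measured at the same context length.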
