A GPT, made only of MLPs, in Jax

Last update: Sep 27, 2022

Overview

MLP GPT - Jax (wip)

A GPT made only of MLPs, in Jax. The MLPs used are gMLPs with Spatial Gating Units.
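
To make the idea of a Spatial Gating Unit more concrete, here is a minimal sketch in JAX, following the description in the gMLP paper cited below. It is not the code from this repository. The names `init_sgu`, `causal_sgu` and `layer_norm`, the uniform initialization, and the use of a lower-triangular mask for causality are all assumptions made for illustration.

```python
import jax
import jax.numpy as jnp

def layer_norm(x, eps=1e-5):
    # Normalize each token's features independently (no learned scale or shift here).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / jnp.sqrt(var + eps)

def init_sgu(key, seq_len, init_eps=1e-3):
    # Near-zero spatial weights and a bias of ones, so the unit starts out
    # close to an identity gate (the initialization suggested in the paper).
    scale = init_eps / seq_len  # true division, so the scale stays positive
    w = jax.random.uniform(key, (seq_len, seq_len), minval=-1.0, maxval=1.0) * scale
    b = jnp.ones((seq_len, 1))
    return {"w": w, "b": b}

def causal_sgu(params, x):
    # x has shape (seq_len, dim), with dim even.
    res, gate = jnp.split(x, 2, axis=-1)       # two halves along the channel axis
    gate = layer_norm(gate)
    w = jnp.tril(params["w"])                  # position i only mixes positions j <= i
    gate = w @ gate + params["b"]              # projection along the sequence axis
    return res * gate                          # elementwise gating, shape (seq_len, dim // 2)
```

The lower-triangular mask is what would let such a unit be used autoregressively: changing a later token leaves the outputs at earlier positions unchanged.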

Install

$ pip install mlp-gpt-jax

Usage

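from jax import random
from mlp_gpt_jax import MLPGpt

# gMLP-based language model: 20,000-token vocabulary, 6 layers, 512-token context
gpt = MLPGpt(
    num_tokens = 20000,
    dim = 512,
    depth = 6,
    seq_len = 512
)

# use separate PRNG keys for the sample sequence and for parameter initialization
key = random.PRNGKey(0)
seq_key, init_key = random.split(key)
seq = random.randint(seq_key, (512,), 0, 20000)

params = gpt.init(init_key, seq)
logits = gpt.apply(params, seq) # (512, 20000)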
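
The logits have one row per position and one column per token, so a GPT-style training objective can be computed by scoring each position's prediction against the next token. The following is a rough sketch of that step using only JAX. The helper name `next_token_loss` is hypothetical and not part of `mlp_gpt_jax`.

```python
import jax
import jax.numpy as jnp

def next_token_loss(logits, seq):
    # logits: (seq_len, num_tokens) floats; seq: (seq_len,) integer token ids.
    # The prediction at position i is scored against the token at position i + 1,
    # so the last row of logits and the first token are dropped.
    log_probs = jax.nn.log_softmax(logits[:-1], axis=-1)
    targets = seq[1:]
    picked = jnp.take_along_axis(log_probs, targets[:, None], axis=-1)
    return -picked.mean()
```

With uniform logits the loss equals the log of the vocabulary size, which is a handy sanity check before training.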

Citations

@misc{liu2021pay,
    title   = {Pay Attention to MLPs}, 
    author  = {Hanxiao Liu and Zihang Dai and David R. So and Quoc V. Le},
    year    = {2021},
    eprint  = {2105.08050},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}

Comments

mistake in parameter initialization

floor division will always return 0 :(

https://github.com/lucidrains/mlp-gpt-jax/blob/c8a6d7738562e44d3c0b3018c83ae577f7931e78/mlp_gpt_jax/mlp_gpt_jax.py#L75

opened by guyd1995
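
The pitfall described in this comment can be reproduced in isolation. The snippet below is only an illustration: the variable names and values are hypothetical, and the exact expression at the linked line is not reproduced here.

```python
# Hypothetical values: a small initialization constant scaled by the sequence length.
init_eps = 1e-3
seq_len = 512

# Floor division rounds the quotient down, so any value below 1 becomes 0.0.
floor_scaled = init_eps // seq_len

# True division keeps the small positive scale that was presumably intended.
true_scaled = init_eps / seq_len

print(floor_scaled, true_scaled)
```

A scale of exactly zero would initialize the affected weights to zero, which is why the choice of division operator matters here.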

Releases(0.0.19)

0.0.19 (Jun 23, 2021)
0.0.18 (Jun 22, 2021)
0.0.17 (Jun 22, 2021)
0.0.16 (Jun 3, 2021)
0.0.15 (Jun 3, 2021)
0.0.14 (Jun 2, 2021)
0.0.12 (Jun 2, 2021)
0.0.11 (Jun 2, 2021)
0.0.10 (Jun 2, 2021)
0.0.9 (Jun 2, 2021)
0.0.8 (May 29, 2021)
0.0.7 (May 27, 2021)
0.0.6 (May 26, 2021)
0.0.5 (May 25, 2021)
0.0.4 (May 23, 2021)
0.0.3 (May 22, 2021)
0.0.2 (May 21, 2021)
0.0.1 (May 21, 2021)

Owner

Phil Wang

Working with Attention
