GPT, but made only out of gMLPs

Last update: Dec 01, 2022

Overview

GPT - gMLP

This repository attempts to crack long-context autoregressive language modeling (GPT) using variations of gMLPs. Specifically, it contains a variant that applies gMLP over local sliding windows. The hope is to train context lengths of 4096 and above on a single GPU, efficiently and well.

GPT is technically a misnomer now, since the architecture contains no attention (transformer) at all.
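
As a rough illustration of what mixing within local windows can mean for a causal model, the sketch below splits a sequence into fixed-size blocks and lets each position combine only itself and earlier positions of its own block. This is a toy in plain Python under stated assumptions, not the repository's implementation: `causal_window_mix` and its arguments are hypothetical names, the values are scalars rather than feature vectors, and the real layers may window or gate differently.

```python
def causal_window_mix(values, weights, window):
    """Toy causal mixing inside non-overlapping windows (hypothetical helper).

    values  : list of floats; its length must be a multiple of `window`
    weights : window x window nested list; only entries with j <= i are read,
              which is what keeps the mixing causal
    """
    if window <= 0 or len(values) % window != 0:
        raise ValueError("sequence length must be a positive multiple of window")
    mixed = []
    for start in range(0, len(values), window):
        block = values[start:start + window]
        for i in range(window):
            # position i only sees positions 0..i of its own block
            mixed.append(sum(weights[i][j] * block[j] for j in range(i + 1)))
    return mixed


ones = [[1.0, 1.0], [1.0, 1.0]]
print(causal_window_mix([1.0, 2.0, 3.0, 4.0], ones, window=2))  # [1.0, 3.0, 3.0, 7.0]
```

Changing the last input value only changes the last output, which is the property an autoregressive model needs from its token-mixing step.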

Install

$ pip install g-mlp-gpt

Usage

import torch
from g_mlp_gpt import gMLPGPT

model = gMLPGPT(
    num_tokens = 20000,
    dim = 512,
    depth = 4,
    seq_len = 1024,
    window = (128, 256, 512, 1024) # window sizes for each depth
)

x = torch.randint(0, 20000, (1, 1000))
logits = model(x) # (1, 1000, 20000)
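
Since the example passes one window size per layer, a small pre-flight check can catch a mismatched configuration before a model is built. The helper below is a sketch only: `check_window_config` is a hypothetical name that is not part of `g_mlp_gpt`, it assumes every entry is a plain integer, and the rule that a window should not exceed `seq_len` is an assumption rather than a documented requirement.

```python
def check_window_config(depth, seq_len, window):
    """Hypothetical sanity check for a per-layer window tuple."""
    if len(window) != depth:
        raise ValueError(f"expected {depth} window sizes, got {len(window)}")
    for layer, size in enumerate(window):
        # assumption: a window larger than the sequence is a configuration mistake
        if not 0 < size <= seq_len:
            raise ValueError(f"layer {layer}: window {size} is outside 1..{seq_len}")
    return True


# the values from the usage example above
check_window_config(depth=4, seq_len=1024, window=(128, 256, 512, 1024))
```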

16k context length

import torch
from g_mlp_gpt import gMLPGPT

model = gMLPGPT(
    num_tokens = 20000,
    dim = 512,
    seq_len = 16384,
    depth = 12, # one layer per window entry below
    reversible = True,
    window = (
        128, 128, 256, 512, 1024, 1024,
        (2048, 2), (2048, 2), # (window size, axial factor)
        (4096, 4), (4096, 4),
        (8192, 8), (8192, 8)
    )
).cuda()

x = torch.randint(0, 20000, (1, 16384)).cuda()
logits = model(x) # (1, 16384, 20000)
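
To get a feel for why small windows in the early layers matter at this length, the arithmetic below counts how many (query, earlier-token) pairs one layer has to mix. It is a back-of-the-envelope sketch with hypothetical function names: it assumes non-overlapping windows, counts token pairs rather than FLOPs or memory, and ignores the axial setting entirely.

```python
def causal_pairs_full(seq_len):
    """Pairs (i, j) with j <= i over the whole sequence."""
    return seq_len * (seq_len + 1) // 2


def causal_pairs_windowed(seq_len, window):
    """Same count when each token only sees earlier tokens in its own window."""
    if seq_len % window != 0:
        raise ValueError("seq_len must be a multiple of window")
    return (seq_len // window) * (window * (window + 1) // 2)


full = causal_pairs_full(16384)             # 134225920
local = causal_pairs_windowed(16384, 128)   # 1056768
print(full, local, round(full / local, 1))  # 134225920 1056768 127.0
```

Under these assumptions a 128-token window needs roughly 1/127 of the pairwise mixing of a full-length layer, while the 8192-token windows in the last layers still cost about half as much as full context.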

Citations

@misc{liu2021pay,
    title   = {Pay Attention to MLPs}, 
    author  = {Hanxiao Liu and Zihang Dai and David R. So and Quoc V. Le},
    year    = {2021},
    eprint  = {2105.08050},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}

Releases (0.0.15)

0.0.15 (May 25, 2021)
0.0.14 (May 25, 2021)
0.0.12 (May 24, 2021)
0.0.11 (May 23, 2021)
0.0.10 (May 23, 2021)
0.0.9 (May 21, 2021)
0.0.8 (May 21, 2021)
0.0.7 (May 21, 2021)
0.0.6 (May 21, 2021)
0.0.5 (May 20, 2021)
0.0.4 (May 20, 2021)
0.0.3 (May 20, 2021)
0.0.2 (May 20, 2021)
0.0.1 (May 20, 2021)

Owner

Phil Wang
