PyTorch implementation of Pay Attention to MLPs

Last update: Dec 13, 2022

Overview

gMLP

PyTorch implementation of Pay Attention to MLPs.

Quickstart

Clone this repository.

git clone https://github.com/jaketae/g-mlp.git

Navigate to the cloned directory. You can then use the bare-bones gMLP model via:

>>> from g_mlp import gMLP
>>> model = gMLP()

By default, the model comes with the following parameters:

gMLP(
    d_model=256,
    d_ffn=512,
    seq_len=256,
    num_layers=6,
)

Usage

The repository also contains gMLP models specifically for language modeling and image classification.

NLP

gMLPForLanguageModeling shares the same default parameters as gMLP, with num_tokens=10000 as an added parameter that represents the size of the token embedding table.

>>> import torch
>>> from g_mlp import gMLPForLanguageModeling
>>> model = gMLPForLanguageModeling()
>>> tokens = torch.randint(0, 10000, (8, 256))
>>> model(tokens).shape
torch.Size([8, 256, 256])
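
The last dimension of the output above is 256, which matches d_model rather than num_tokens, so the model appears to return per-token hidden states instead of vocabulary logits. The following is a minimal, self-contained sketch of that shape arithmetic and does not use the repository's code. The `lm_head` projection is a hypothetical addition for illustration, not something the repository is known to provide.

```python
import torch
from torch import nn

# Values taken from the defaults described above.
num_tokens, d_model, seq_len = 10000, 256, 256

# The token embedding table has one d_model-sized row per token id.
embedding = nn.Embedding(num_tokens, d_model)

tokens = torch.randint(0, num_tokens, (2, seq_len))
hidden = embedding(tokens)  # (batch, seq_len, d_model)

# Hypothetical output head (an assumption): maps hidden states to
# one score per vocabulary entry.
lm_head = nn.Linear(d_model, num_tokens)
logits = lm_head(hidden)  # (batch, seq_len, num_tokens)

print(hidden.shape, logits.shape)
```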

Computer Vision

gMLPForImageClassification is a ViT-esque version of gMLP that adds a patch-creation layer in front of the model and a classification head after it.

>>> import torch
>>> from g_mlp import gMLPForImageClassification
>>> model = gMLPForImageClassification()
>>> images = torch.randn(8, 3, 256, 256)
>>> model(images).shape
torch.Size([8, 1000])
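
To illustrate what a patch-creating layer and a classification head do, here is a minimal sketch that stands apart from the repository's code. It assumes a patch size of 16, chosen because a 256x256 image then yields (256 / 16) ** 2 = 256 patches, which matches the default seq_len. The mean pooling and the names `patchify` and `head` are likewise assumptions for illustration.

```python
import torch
from torch import nn

# Assumed values: patch_size is a guess that makes the patch count
# equal the default seq_len of 256.
patch_size, d_model, num_classes = 16, 256, 1000

# A strided convolution cuts the image into non-overlapping patches and
# linearly projects each patch to d_model channels in one step.
patchify = nn.Conv2d(3, d_model, kernel_size=patch_size, stride=patch_size)

images = torch.randn(8, 3, 256, 256)
patches = patchify(images)                   # (8, d_model, 16, 16)
tokens = patches.flatten(2).transpose(1, 2)  # (8, seq_len, d_model)

# Hypothetical head: average over patches, then project to class scores.
head = nn.Linear(d_model, num_classes)
logits = head(tokens.mean(dim=1))            # (8, num_classes)

print(tokens.shape, logits.shape)
```

In a full model the gMLP layers would sit between `tokens` and the pooling step; they are omitted here because they leave the tensor shape unchanged.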

Summary

The authors of the paper present gMLP, an attention-free, all-MLP architecture based on spatial gating units. gMLP achieves parity with Transformer models such as ViT and BERT on language and vision downstream tasks. The authors also show that gMLP scales with increased data and number of parameters, suggesting that self-attention is not a necessary component for designing performant models.
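
The spatial gating unit mentioned above can be sketched as follows. This is a minimal illustration based on the paper's description (split the channels in two, normalize one half, mix it across the sequence dimension, and use it to gate the other half). It is not necessarily how this repository implements it: the class names are illustrative, and halving d_ffn inside the gate is an assumption about layer sizing.

```python
import torch
from torch import nn


class SpatialGatingUnit(nn.Module):
    """Sketch of a spatial gating unit (illustrative names)."""

    def __init__(self, d_ffn: int, seq_len: int):
        super().__init__()
        # The gate is computed from half of the channels.
        self.norm = nn.LayerNorm(d_ffn // 2)
        # Linear projection across the sequence (spatial) dimension.
        self.spatial_proj = nn.Linear(seq_len, seq_len)
        # Near-zero weights and a bias of one make the unit start out
        # close to an identity mapping on the ungated half.
        nn.init.normal_(self.spatial_proj.weight, std=1e-6)
        nn.init.ones_(self.spatial_proj.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_ffn)
        u, v = x.chunk(2, dim=-1)
        v = self.norm(v)
        # Put seq_len last so nn.Linear mixes information across tokens.
        v = self.spatial_proj(v.transpose(1, 2)).transpose(1, 2)
        return u * v  # (batch, seq_len, d_ffn // 2)


class GMLPBlock(nn.Module):
    """Sketch of one gMLP block: norm, expand, gate, project, residual."""

    def __init__(self, d_model: int = 256, d_ffn: int = 512, seq_len: int = 256):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.proj_in = nn.Linear(d_model, d_ffn)
        self.act = nn.GELU()
        self.sgu = SpatialGatingUnit(d_ffn, seq_len)
        self.proj_out = nn.Linear(d_ffn // 2, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        shortcut = x
        x = self.act(self.proj_in(self.norm(x)))
        x = self.sgu(x)
        return self.proj_out(x) + shortcut


block = GMLPBlock(d_model=32, d_ffn=64, seq_len=16)
out = block(torch.randn(2, 16, 32))
print(out.shape)
```

Because the spatial projection is a fixed seq_len x seq_len matrix, a block built this way only accepts inputs of exactly seq_len tokens, which is why seq_len appears as a constructor parameter.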

Owner

Jake Tae
