An All-MLP solution for Vision, from Google AI

Last update: Jan 06, 2023

Overview

MLP Mixer - Pytorch

An All-MLP solution for Vision, from Google AI, in Pytorch.

No convolutions or attention needed!

Yannic Kilcher video

Install

$ pip install mlp-mixer-pytorch

Usage

import torch
from mlp_mixer_pytorch import MLPMixer

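model = MLPMixer(
    image_size = 256,    # input images are 256 x 256 pixels
    patch_size = 16,     # each patch covers 16 x 16 pixels
    dim = 512,           # hidden (channel) dimension of each patch embedding
    depth = 12,          # number of Mixer layers
    num_classes = 1000   # size of the classification output
)

img = torch.randn(1, 3, 256, 256) # one random RGB image: (batch, channels, height, width)
pred = model(img) # (1, 1000)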
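
The shape arithmetic behind the usage example can be sketched in plain Python. This assumes the image is split into non-overlapping square patches, as described in the MLP-Mixer paper. `patch_grid` is a hypothetical helper written for illustration and is not part of `mlp_mixer_pytorch`.

```python
def patch_grid(image_size, patch_size, channels=3):
    """Return (number of patches, flattened values per patch).

    Hypothetical helper: assumes square images and non-overlapping
    square patches.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2
    # Each patch holds patch_size x patch_size pixels for every channel.
    values_per_patch = channels * patch_size ** 2
    return num_patches, values_per_patch


num_patches, values_per_patch = patch_grid(image_size=256, patch_size=16)
print(num_patches, values_per_patch)
```

With the settings above, the token-mixing MLPs would operate across `num_patches` positions, and each flattened patch would be projected to `dim = 512` features.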

Citations

@misc{tolstikhin2021mlpmixer,
    title   = {MLP-Mixer: An all-MLP Architecture for Vision},
    author  = {Ilya Tolstikhin and Neil Houlsby and Alexander Kolesnikov and Lucas Beyer and Xiaohua Zhai and Thomas Unterthiner and Jessica Yung and Daniel Keysers and Jakob Uszkoreit and Mario Lucic and Alexey Dosovitskiy},
    year    = {2021},
    eprint  = {2105.01601},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}

You might also like...

PyTorch implementation of MLP-Mixer

PyTorch implementation of MLP-Mixer MLP-Mixer: an all-MLP architecture composed of alternate token-mixing and channel-mixing operations. The token-mix

33 Nov 27, 2022

Unofficial Implementation of MLP-Mixer in TensorFlow

mlp-mixer-tf Unofficial Implementation of MLP-Mixer [abs, pdf] in TensorFlow. Note: This project may have some bugs in it. I'm still learning how to i

24 Mar 23, 2022

Implementation of Segformer, Attention + MLP neural network for segmentation, in Pytorch

Segformer - Pytorch Implementation of Segformer, Attention + MLP neural network for segmentation, in Pytorch. Install $ pip install segformer-pytorch

208 Dec 25, 2022

Implementation of "A MLP-like Architecture for Dense Prediction"

A MLP-like Architecture for Dense Prediction (arXiv) Updates (22/07/2021) Initial release. Model Zoo We provide CycleMLP models pretrained on ImageNet

244 Dec 27, 2022

🍀 Pytorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.⭐⭐⭐

7.7k Jan 5, 2023

PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

PaddlePaddle Vision Transformers State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 🤖 PaddlePaddle Visual Transformers (PaddleViT or

1k Dec 28, 2022

Keras attention models including botnet,CoaT,CoAtNet,CMT,cotnet,halonet,resnest,resnext,resnetd,volo,mlp-mixer,resmlp,gmlp,levit

Keras_cv_attention_models Keras_cv_attention_models Usage Basic Usage Layers Model surgery AotNet ResNetD ResNeXt ResNetQ BotNet VOLO ResNeSt HaloNet

319 Dec 28, 2022

Unofficial Implementation of MLP-Mixer, Image Classification Model

MLP-Mixer Unofficial Implementation of MLP-Mixer, easy to use with terminal. Train and test easily. https://arxiv.org/abs/2105.01601 MLP-Mixer is an arc

6 Dec 5, 2022

MLP-Numpy - A simple modular implementation of Multi Layer Perceptron in pure Numpy.

MLP-Numpy A simple modular implementation of Multi Layer Perceptron in pure Numpy. I used the Iris dataset from scikit-learn library for the experimen

1 Jan 1, 2022

Comments

expansion_factor on tokens is actually a bottleneck in original codebase

Thanks for your implementation. In comparing your codebase to the authors' implementation, I discovered that while you have a single expansion factor in your configuration, the authors have separate values: one for tokens and one for channels.

Specifically, their channels expansion factor is 4, but their tokens expansion factor is 0.5 (with hidden_dim as the base projection size). Note that they actually specify a feature count; I'm translating it to the mechanism you use in this codebase.

Thus, when executing the MixerBlock, the tokens "expansion" is actually a bottleneck.

The parameters can also be verified in Table 1 ("Specifications of Mixer Architectures") at the top of page 4 in version 4 (the current version as of Feb 14, 2022) of their paper.
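
As a quick check of the ratios mentioned above, the snippet below divides the MLP widths by the hidden size. The widths are hard-coded as recalled from Table 1 of the paper, so treat them as an assumption and verify them against the paper before relying on them.

```python
# (hidden size C, token-mixing MLP width D_S, channel-mixing MLP width D_C)
# Assumption: values recalled from Table 1 of the MLP-Mixer paper.
specs = {
    "Mixer-S": (512, 256, 2048),
    "Mixer-B": (768, 384, 3072),
    "Mixer-L": (1024, 512, 4096),
}

# Express both MLP widths as multiples of the hidden size.
ratios = {
    name: (d_s / c, d_c / c)
    for name, (c, d_s, d_c) in specs.items()
}

for name, (token_ratio, channel_ratio) in ratios.items():
    print(name, token_ratio, channel_ratio)
```

Under these values, each listed model has a token-mixing width of half the hidden size and a channel-mixing width of four times the hidden size.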

I'm not suggesting that anything necessarily needs to change in your implementation. However, if you wanted to align your codebase to be able to fully replicate the authors' work, you may consider allowing for two separate parameters: token_expansion_factor and channels_expansion_factor.

Thank you again for this work, and for all your contributions generally. You are an incredible asset to the community.

opened by chazzmoney 1
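
The two-parameter idea from the comment above could look roughly like the following. This is a minimal sketch of one Mixer layer, not the code of `mlp_mixer_pytorch` or of the authors. The class name `MixerBlock` is hypothetical, the parameter names are the ones proposed in the comment, and both factors are assumed to scale `dim`, following the comment's note that hidden_dim is the base.

```python
import torch
from torch import nn


class MixerBlock(nn.Module):
    """One Mixer layer (sketch): token mixing, then channel mixing."""

    def __init__(self, num_patches, dim,
                 token_expansion_factor=0.5, channels_expansion_factor=4.0):
        super().__init__()
        # Assumption: both factors are multiples of `dim`.
        token_hidden = int(dim * token_expansion_factor)
        channel_hidden = int(dim * channels_expansion_factor)

        self.token_norm = nn.LayerNorm(dim)
        # Token mixing: an MLP applied along the patch axis.
        self.token_mlp = nn.Sequential(
            nn.Linear(num_patches, token_hidden),
            nn.GELU(),
            nn.Linear(token_hidden, num_patches),
        )
        self.channel_norm = nn.LayerNorm(dim)
        # Channel mixing: an MLP applied along the feature axis.
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, channel_hidden),
            nn.GELU(),
            nn.Linear(channel_hidden, dim),
        )

    def forward(self, x):
        # x has shape (batch, num_patches, dim).
        y = self.token_norm(x).transpose(1, 2)       # (batch, dim, num_patches)
        x = x + self.token_mlp(y).transpose(1, 2)    # residual, original layout
        x = x + self.channel_mlp(self.channel_norm(x))
        return x


block = MixerBlock(num_patches=256, dim=512)
tokens = torch.randn(2, 256, 512)
out = block(tokens)
```

With `dim = 512`, the defaults give a token-mixing width of 256 and a channel-mixing width of 2048. Setting both factors to the same value would give a single shared expansion factor, the setup the comment describes for this codebase.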

Dall-E implementation

Amazing work! How difficult is it to implement MLP in Dall-E? Since the whole idea of Dall-E revolves around attention layers and transformers, I wonder if this simpler model would enable smaller, equally capable models...

opened by robvanvolt 1

Releases(0.1.1)

0.1.1(Feb 17, 2022)

0.1.0(Feb 17, 2022)

0.0.10(Jun 24, 2021)

0.0.9(Jun 24, 2021)

0.0.8(Jun 24, 2021)

0.0.7(May 30, 2021)

0.0.6(May 7, 2021)

0.0.5(May 7, 2021)

0.0.4(May 7, 2021)

0.0.3(May 5, 2021)

0.0.2b(May 5, 2021)

0.0.1(May 5, 2021)

Owner

Phil Wang
