PyTorch reimplementation of the Mixer (MLP-Mixer: An all-MLP Architecture for Vision)

Last update: Dec 08, 2022

Overview

MLP-Mixer

PyTorch reimplementation of Google's repository for the MLP-Mixer (not yet updated on the master branch), which was released with the paper MLP-Mixer: An all-MLP Architecture for Vision by Ilya Tolstikhin, Neil Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Thomas Unterthiner, Jessica Yung, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, and Alexey Dosovitskiy.

In this paper, the authors show that an architecture built only from multi-layer perceptrons (MLPs), with no convolutions or self-attention, reaches performance close to the state of the art on image classification benchmarks.

MLP-Mixer (Mixer for short) consists of per-patch linear embeddings, Mixer layers, and a classifier head. Each Mixer layer contains one token-mixing MLP and one channel-mixing MLP, each consisting of two fully-connected layers and a GELU nonlinearity. Other components include skip connections, dropout, and a linear classifier head.
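The architecture described above can be sketched in a few dozen lines of PyTorch. This is a minimal illustration, not this repository's implementation: the class names (`MlpBlock`, `MixerLayer`, `MlpMixer`) and the toy sizes are hypothetical, the layer normalization before each MLP follows the paper rather than the text above, and the per-patch linear embedding is written as a strided convolution, which computes the same thing.

```python
import torch
from torch import nn


class MlpBlock(nn.Module):
    """Two fully-connected layers with a GELU nonlinearity in between."""

    def __init__(self, dim, hidden_dim, dropout=0.0):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden_dim),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, dim),
            nn.Dropout(dropout),
        )

    def forward(self, x):
        return self.net(x)


class MixerLayer(nn.Module):
    """One token-mixing MLP and one channel-mixing MLP, each with a skip connection."""

    def __init__(self, num_tokens, channels, tokens_hidden, channels_hidden, dropout=0.0):
        super().__init__()
        # Layer norm is an assumption taken from the paper, not from the README text.
        self.norm1 = nn.LayerNorm(channels)
        self.token_mlp = MlpBlock(num_tokens, tokens_hidden, dropout)
        self.norm2 = nn.LayerNorm(channels)
        self.channel_mlp = MlpBlock(channels, channels_hidden, dropout)

    def forward(self, x):
        # x has shape (batch, tokens, channels).
        y = self.norm1(x).transpose(1, 2)           # (batch, channels, tokens)
        x = x + self.token_mlp(y).transpose(1, 2)   # mix information across patches
        x = x + self.channel_mlp(self.norm2(x))     # mix information across channels
        return x


class MlpMixer(nn.Module):
    """Per-patch embedding, a stack of Mixer layers, and a linear classifier head."""

    def __init__(self, image_size=32, patch_size=8, channels=64, num_layers=2,
                 tokens_hidden=32, channels_hidden=128, num_classes=10):
        super().__init__()
        num_tokens = (image_size // patch_size) ** 2
        # A convolution with kernel == stride == patch size is a per-patch linear map.
        self.embed = nn.Conv2d(3, channels, kernel_size=patch_size, stride=patch_size)
        self.layers = nn.Sequential(*[
            MixerLayer(num_tokens, channels, tokens_hidden, channels_hidden)
            for _ in range(num_layers)
        ])
        self.norm = nn.LayerNorm(channels)
        self.head = nn.Linear(channels, num_classes)

    def forward(self, images):
        x = self.embed(images)                # (batch, channels, H/P, W/P)
        x = x.flatten(2).transpose(1, 2)      # (batch, tokens, channels)
        x = self.layers(x)
        x = self.norm(x).mean(dim=1)          # global average pooling over tokens
        return self.head(x)
```

With the toy defaults, `MlpMixer()(torch.randn(2, 3, 32, 32))` returns a tensor of shape `(2, 10)`. The transpose in `MixerLayer.forward` is the key step: the same `MlpBlock` mixes across patches or across channels depending only on which axis is last.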

Usage

1. Download Pre-trained model (Google's Official Checkpoint)

Available models: Mixer-B_16, Mixer-L_16
- ImageNet pre-trained models
  - Mixer-B_16, Mixer-L_16
- ImageNet-21k pre-trained models
  - Mixer-B_16, Mixer-L_16

# imagenet pre-train
wget https://storage.googleapis.com/mixer_models/imagenet1k/{MODEL_NAME}.npz

# imagenet-21k pre-train
wget https://storage.googleapis.com/mixer_models/imagenet21k/{MODEL_NAME}.npz
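The `{MODEL_NAME}` placeholder in the commands above stands for one of the available model names. As a small sketch, the same URLs can be built and fetched from Python with the standard library. The helper names `checkpoint_url` and `download_checkpoint` are hypothetical and not part of this repository, and whether the files are still hosted at these addresses is not verified here.

```python
import os
import urllib.request

BASE_URL = "https://storage.googleapis.com/mixer_models"
UPSTREAMS = ("imagenet1k", "imagenet21k")   # ImageNet and ImageNet-21k pre-training
MODELS = ("Mixer-B_16", "Mixer-L_16")


def checkpoint_url(model_name, upstream):
    """Build the checkpoint URL for a model name and upstream dataset."""
    if upstream not in UPSTREAMS:
        raise ValueError(f"unknown upstream: {upstream!r}")
    if model_name not in MODELS:
        raise ValueError(f"unknown model: {model_name!r}")
    return f"{BASE_URL}/{upstream}/{model_name}.npz"


def download_checkpoint(model_name, upstream, target_dir="checkpoint"):
    """Download a checkpoint into target_dir and return the local path."""
    os.makedirs(target_dir, exist_ok=True)
    path = os.path.join(target_dir, f"{model_name}.npz")
    urllib.request.urlretrieve(checkpoint_url(model_name, upstream), path)
    return path
```

For example, `download_checkpoint("Mixer-B_16", "imagenet21k")` would save the file as `checkpoint/Mixer-B_16.npz`, the path used by the fine-tuning command.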

2. Fine-tuning

python3 train.py --name cifar10-100_500 --model_type Mixer-B_16 --pretrained_dir checkpoint/Mixer-B_16.npz

Reproducing Mixer results

Upstream	Model	Dataset	Accuracy (official, %)
ImageNet	Mixer-B/16	cifar10	96.72
ImageNet	Mixer-L/16	cifar10	96.59
ImageNet-21k	Mixer-B/16	cifar10	96.82
ImageNet-21k	Mixer-L/16	cifar10	96.34

Reference

Google's Vision Transformer and MLP-Mixer

Citations

@article{tolstikhin2021,
  title={MLP-Mixer: An all-MLP Architecture for Vision},
  author={Tolstikhin, Ilya and Houlsby, Neil and Kolesnikov, Alexander and Beyer, Lucas and Zhai, Xiaohua and Unterthiner, Thomas and Yung, Jessica and Keysers, Daniel and Uszkoreit, Jakob and Lucic, Mario and Dosovitskiy, Alexey},
  journal={arXiv preprint arXiv:2105.01601},
  year={2021}
}

Owner

Eunkwang Jeon
