MLP-Mixer: An all-MLP Architecture for Vision

This repo contains a PyTorch implementation of MLP-Mixer: An all-MLP Architecture for Vision.

Usage :

import torch
import numpy as np
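import importlib

# "mlp-mixer" contains a hyphen, so `from mlp-mixer import MLPMixer` is a syntax error.
# Assuming the module file is named mlp-mixer.py, load it by name instead.
MLPMixer = importlib.import_module("mlp-mixer").MLPMixer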

img = torch.ones([1, 3, 224, 224])

model = MLPMixer(in_channels=3, image_size=224, patch_size=16, num_classes=1000,
                 dim=512, depth=8, token_dim=256, channel_dim=2048)

parameters = filter(lambda p: p.requires_grad, model.parameters())
parameters = sum([np.prod(p.size()) for p in parameters]) / 1_000_000
print('Trainable Parameters: %.3fM' % parameters)

out_img = model(img)

print("Shape of out :", out_img.shape)  # [B, num_classes]
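
The hyperparameters in the usage example determine the shapes the model works with. The following is a minimal sketch of that arithmetic, assuming the non-overlapping patch split described in the paper. The helpers `mixer_shapes` and `mlp_params` are hypothetical and not part of this repo, and reading `token_dim` and `channel_dim` as the hidden widths of the token-mixing and channel-mixing MLPs is an assumption.

```python
def mixer_shapes(image_size=224, patch_size=16, in_channels=3, dim=512):
    """Shape bookkeeping for a Mixer-style patch embedding (illustrative only)."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2                 # tokens per image
    patch_dim = in_channels * patch_size * patch_size   # values in one flattened patch
    return {
        "num_patches": num_patches,
        "patch_dim": patch_dim,
        "token_table": (num_patches, dim),              # per-image shape after projection to dim
    }


def mlp_params(width, hidden):
    """Parameter count of a two-layer MLP width -> hidden -> width, biases included."""
    return (width * hidden + hidden) + (hidden * width + width)


shapes = mixer_shapes()
print(shapes)

# Assumption: token_dim=256 is the hidden width of the MLP that mixes across patches,
# and channel_dim=2048 is the hidden width of the MLP that mixes across channels.
print("token-mixing MLP params  :", mlp_params(shapes["num_patches"], 256))
print("channel-mixing MLP params:", mlp_params(512, 2048))
```

With the defaults above, a 224x224 image split into 16x16 patches gives 196 tokens, each flattened from 768 values before projection to `dim`.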

Citation :

@misc{tolstikhin2021mlpmixer,
      title={MLP-Mixer: An all-MLP Architecture for Vision}, 
      author={Ilya Tolstikhin and Neil Houlsby and Alexander Kolesnikov and Lucas Beyer and Xiaohua Zhai and Thomas Unterthiner and Jessica Yung and Daniel Keysers and Jakob Uszkoreit and Mario Lucic and Alexey Dosovitskiy},
      year={2021},
      eprint={2105.01601},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement :

Some components are borrowed from the ViT code in @lucidrains' repo: https://github.com/lucidrains/vit-pytorch

Unofficial implementation of MLP-Mixer: An all-MLP Architecture for Vision

Owner

Rishikesh (ऋषिकेश)
