Implementation of ResMLP, an all MLP solution to image classification, in Pytorch

Overview

ResMLP - Pytorch

Implementation of ResMLP, an all MLP solution to image classification out of Facebook AI, in Pytorch

Install

$ pip install res-mlp-pytorch

Usage

import torch
from res_mlp_pytorch import ResMLP

model = ResMLP(
    image_size = 256,
    patch_size = 16,
    dim = 512,
    depth = 12,
    num_classes = 1000
)

img = torch.randn(1, 3, 256, 256)
pred = model(img) # (1, 1000)

Citations

@misc{touvron2021resmlp,
    title   = {ResMLP: Feedforward networks for image classification with data-efficient training}, 
    author  = {Hugo Touvron and Piotr Bojanowski and Mathilde Caron and Matthieu Cord and Alaaeldin El-Nouby and Edouard Grave and Armand Joulin and Gabriel Synnaeve and Jakob Verbeek and Hervé Jégou},
    year    = {2021},
    eprint  = {2105.03404},
    archivePrefix = {arXiv},
    primaryClass = {cs.CV}
}
You might also like...
Implementation of Segformer, Attention + MLP neural network for segmentation, in Pytorch
Implementation of Segformer, Attention + MLP neural network for segmentation, in Pytorch

Segformer - Pytorch Implementation of Segformer, Attention + MLP neural network for segmentation, in Pytorch. Install $ pip install segformer-pytorch

🍀 Pytorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.⭐⭐⭐
🍀 Pytorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.⭐⭐⭐

🍀 Pytorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.⭐⭐⭐

Pytorch implementation of MLP-Mixer with loading pre-trained models.

MLP-Mixer-Pytorch PyTorch implementation of MLP-Mixer: An all-MLP Architecture for Vision with the function of loading official ImageNet pre-trained p

Image Classification - A research on image classification and auto insurance claim prediction, a systematic experiments on modeling techniques and approaches

A research on image classification and auto insurance claim prediction, a systematic experiments on modeling techniques and approaches

MLP-Like Vision Permutator for Visual Recognition (PyTorch)
MLP-Like Vision Permutator for Visual Recognition (PyTorch)

Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition (arxiv) This is a Pytorch implementation of our paper. We present Vision

Xview3 solution - XView3 challenge, 2nd place solution
Xview3 solution - XView3 challenge, 2nd place solution

Xview3, 2nd place solution https://iuu.xview.us/ test split aggregate score publ

Unofficial Implementation of MLP-Mixer in TensorFlow
Unofficial Implementation of MLP-Mixer in TensorFlow

mlp-mixer-tf Unofficial Implementation of MLP-Mixer [abs, pdf] in TensorFlow. Note: This project may have some bugs in it. I'm still learning how to i

Implementation of
Implementation of "A MLP-like Architecture for Dense Prediction"

A MLP-like Architecture for Dense Prediction (arXiv) Updates (22/07/2021) Initial release. Model Zoo We provide CycleMLP models pretrained on ImageNet

MLP-Numpy - A simple modular implementation of Multi Layer Perceptron in pure Numpy.

MLP-Numpy A simple modular implementation of Multi Layer Perceptron in pure Numpy. I used the Iris dataset from scikit-learn library for the experimen

Comments
  • torch dataset example

    torch dataset example

    I wrote this examples with a data loader:

    import os
    import natsort
    from PIL import Image
    import torch
    import torchvision.transforms as T
    from res_mlp_pytorch.res_mlp_pytorch import ResMLP
    
    class LPCustomDataSet(torch.utils.data.Dataset):
        '''
            Naive Torch Image Dataset Loader
            with support for Image loading errors
            and Image resizing
        '''
        def __init__(self, main_dir, transform):
            self.main_dir = main_dir
            self.transform = transform
            all_imgs = os.listdir(main_dir)
            self.total_imgs = natsort.natsorted(all_imgs)
    
        def __len__(self):
            return len(self.total_imgs)
    
        def __getitem__(self, idx):
            img_loc = os.path.join(self.main_dir, self.total_imgs[idx])
            try:
                image = Image.open(img_loc).convert("RGB")
                tensor_image = self.transform(image)
                return tensor_image
            except:
                pass
                return None
    
        @classmethod
        def collate_fn(self, batch):
            '''
                Collate filtering not None images
            '''
            batch = list(filter(lambda x: x is not None, batch))
            return torch.utils.data.dataloader.default_collate(batch)
    
        @classmethod
        def transform(self,img):
            '''
                Naive image resizer
            '''
            transform = T.Compose([
                T.Resize(256),
                T.CenterCrop(224),
                T.ToTensor(),
                T.Normalize(
                    mean=[0.485, 0.456, 0.406],
                    std=[0.229, 0.224, 0.225]
                )
            ])
            return transform(img)
    

    to feed ResMLP:

    model = ResMLP(
        image_size = 256,
        patch_size = 16,
        dim = 512,
        depth = 12,
        num_classes = 1000
    )
    batch_size = 2
    my_dataset = LPCustomDataSet(os.path.join(os.path.dirname(
        os.path.abspath(__file__)), 'data'), transform=LPCustomDataSet.transform)
    train_loader = torch.utils.data.DataLoader(my_dataset , batch_size=batch_size, shuffle=False, 
                                   num_workers=4, drop_last=True, collate_fn=LPCustomDataSet.collate_fn)
    for idx, img in enumerate(train_loader):
        pred = model(img) # (1, 1000)
        print(idx, img.shape, pred.shape
    

    But I get this error

    RuntimeError: Given groups=1, weight of size [256, 256, 1], expected input[1, 196, 512] to have 256 channels, but got 196 channels instead
    

    not sure if LPCustomDataSet.transform has the correct for the input image

    opened by loretoparisi 3
  • add dropout and CIFAR100 example notebook

    add dropout and CIFAR100 example notebook

    • According to ResMLP paper, it appears that dropout layer has been implemented in Machine translation when using ResMLP.
    We use Adagrad with learning rate 0.2, 32k steps of linear warmup, label smoothing 0.1, dropout rate 0.15 for En-De and 0.1 for En-Fr.
    
    • Since MLP literatures often mention that MLP is susceptible to overfitting, which is one of the reason why weight decay is so high, implementing dropout will be reasonable choice of regularization.

    Open in Colab | 🔗 Wandb Log

    • Above is my simple experimentation on CIFAR100 dataset, with three different dropout rates: [0.0, 0.25, 0.5].
    • Higher dropout yielded better test metrics(loss, acc1 and acc5).
    opened by snoop2head 0
  • What learning rate/scheduler/optimizer are suitable for training mlp-mixer?

    What learning rate/scheduler/optimizer are suitable for training mlp-mixer?

    Thanks for your codes!

    I find it is very important to set suitable lr/scheduler/optimizer for training res-mlp models. In my experiments with a small dataset, the classification performance is very poor when I train models with lr=1e-3 or 1e-4, weight-decay=05e-4, scheduler=WarmupCosineLrScheduler, optim='sgd'. The results increase remarkably when lr=5e-3, weight-decay=0.2, scheduler=WarmupCosineLrScheduler, optim='lamb'.

    While the results are still much lower than CNN models with comparable params. trained from scratch. Could you provide any suggestions for training res-mlp?

    opened by QiushiYang 0
Releases(0.0.6)
Owner
Phil Wang
Working with Attention.
Phil Wang
VisionKG: Vision Knowledge Graph

VisionKG: Vision Knowledge Graph Official Repository of VisionKG by Anh Le-Tuan, Trung-Kien Tran, Manh Nguyen-Duc, Jicheng Yuan, Manfred Hauswirth and

Continuous Query Evaluation over Linked Stream (CQELS) 9 Jun 23, 2022
A simple image/video to Desmos graph converter run locally

Desmos Bezier Renderer A simple image/video to Desmos graph converter run locally Sample Result Setup Install dependencies apt update apt install git

Kevin JY Cui 339 Dec 23, 2022
Constructing interpretable quadratic accuracy predictors to serve as an objective function for an IQCQP problem that represents NAS under latency constraints and solve it with efficient algorithms.

IQNAS: Interpretable Integer Quadratic programming Neural Architecture Search Realistic use of neural networks often requires adhering to multiple con

0 Oct 24, 2021
This is the official implementation of the paper "Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation".

ObjProp Introduction This is the official implementation of the paper "Object Propagation via Inter-Frame Attentions for Temporally Stable Video Insta

Anirudh S Chakravarthy 6 May 03, 2022
CTRMs: Learning to Construct Cooperative Timed Roadmaps for Multi-agent Path Planning in Continuous Spaces

CTRMs: Learning to Construct Cooperative Timed Roadmaps for Multi-agent Path Planning in Continuous Spaces This is a repository for the following pape

17 Oct 13, 2022
Accelerated NLP pipelines for fast inference on CPU and GPU. Built with Transformers, Optimum and ONNX Runtime.

Optimum Transformers Accelerated NLP pipelines for fast inference 🚀 on CPU and GPU. Built with 🤗 Transformers, Optimum and ONNX runtime. Installatio

Aleksey Korshuk 115 Dec 16, 2022
This project is a loose implementation of paper "Algorithmic Financial Trading with Deep Convolutional Neural Networks: Time Series to Image Conversion Approach"

Stock Market Buy/Sell/Hold prediction Using convolutional Neural Network This repo is an attempt to implement the research paper titled "Algorithmic F

Asutosh Nayak 136 Dec 28, 2022
The implementation of DeBERTa

DeBERTa: Decoding-enhanced BERT with Disentangled Attention This repository is the official implementation of DeBERTa: Decoding-enhanced BERT with Dis

Microsoft 1.2k Jan 06, 2023
Train a state-of-the-art yolov3 object detector from scratch!

TrainYourOwnYOLO: Building a Custom Object Detector from Scratch This repo let's you train a custom image detector using the state-of-the-art YOLOv3 c

AntonMu 616 Jan 08, 2023
Source code for "Understanding Knowledge Integration in Language Models with Graph Convolutions"

Graph Convolution Simulator (GCS) Source code for "Understanding Knowledge Integration in Language Models with Graph Convolutions" Requirements: PyTor

yifan 10 Oct 18, 2022
A PyTorch implementation of "Signed Graph Convolutional Network" (ICDM 2018).

SGCN ⠀ A PyTorch implementation of Signed Graph Convolutional Network (ICDM 2018). Abstract Due to the fact much of today's data can be represented as

Benedek Rozemberczki 251 Nov 30, 2022
An implementation of "Optimal Textures: Fast and Robust Texture Synthesis and Style Transfer through Optimal Transport"

Optex An implementation of Optimal Textures: Fast and Robust Texture Synthesis and Style Transfer through Optimal Transport for TU Delft CS4240. You c

Hans Brouwer 33 Jan 05, 2023
ANN model for prediction a spatio-temporal distribution of supercooled liquid in mixed-phase clouds using Doppler cloud radar spectra.

VOODOO Revealing supercooled liquid beyond lidar attenuation Explore the docs » Report Bug · Request Feature Table of Contents About The Project Built

remsens-lim 2 Apr 28, 2022
Monocular 3D Object Detection: An Extrinsic Parameter Free Approach (CVPR2021)

Monocular 3D Object Detection: An Extrinsic Parameter Free Approach (CVPR2021) Yunsong Zhou, Yuan He, Hongzi Zhu, Cheng Wang, Hongyang Li, Qinhong Jia

Yunsong Zhou 51 Dec 14, 2022
A Parameter-free Deep Embedded Clustering Method for Single-cell RNA-seq Data

A Parameter-free Deep Embedded Clustering Method for Single-cell RNA-seq Data Overview Clustering analysis is widely utilized in single-cell RNA-seque

AI-Biomed @NSCC-gz 3 May 08, 2022
SARS-Cov-2 Recombinant Finder for fasta sequences

Sc2rf - SARS-Cov-2 Recombinant Finder Pronounced: Scarf What's this? Sc2rf can search genome sequences of SARS-CoV-2 for potential recombinants - new

Lena Schimmel 41 Oct 03, 2022
DumpSMBShare - A script to dump files and folders remotely from a Windows SMB share

DumpSMBShare A script to dump files and folders remotely from a Windows SMB shar

Podalirius 178 Jan 06, 2023
KoCLIP: Korean port of OpenAI CLIP, in Flax

KoCLIP This repository contains code for KoCLIP, a Korean port of OpenAI's CLIP. This project was conducted as part of Hugging Face's Flax/JAX communi

Jake Tae 100 Jan 02, 2023
An imperfect information game is a type of game with asymmetric information

DecisionHoldem An imperfect information game is a type of game with asymmetric information. Compared with perfect information game, imperfect informat

Decision AI 25 Dec 23, 2022
official implementation for the paper "Simplifying Graph Convolutional Networks"

Simplifying Graph Convolutional Networks Updates As pointed out by #23, there was a subtle bug in our preprocessing code for the reddit dataset. After

Tianyi 727 Jan 01, 2023