RealFormer-Pytorch Implementation of RealFormer using pytorch

Last update: Dec 08, 2022

RealFormer-Pytorch

Implementation of RealFormer using PyTorch. Includes a comparison with the classical Transformer on an image classification task (ViT) using the CIFAR-10 dataset.

Original paper for the model: https://arxiv.org/abs/2012.11747

So how are RealFormers at vision tasks?

Run train.py with

model = ViR(
        image_pix = 32,
        patch_pix = 4,
        class_cnt = 10,
        layer_cnt = 4
    )

to test how RealFormer performs on the CIFAR-10 dataset compared with the classical ViT, which is

model = ViT(
        image_pix = 32,
        patch_pix = 4,
        class_cnt = 10,
        layer_cnt = 4
    )

... which is, of course, a much, much smaller version of ViT than the original ones.
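For a sense of scale, the sequence length implied by these settings can be worked out by hand. The sketch below assumes that `image_pix` and `patch_pix` are the side lengths of a square image and of non-overlapping square patches; the README does not confirm this, and `patch_grid` is a hypothetical helper, not part of the repository. CIFAR-10 images are 32x32 RGB, hence the 3 channels.

```python
def patch_grid(image_pix: int, patch_pix: int, channels: int = 3):
    """Return (number of patches, values per flattened patch).

    Assumes non-overlapping square patches that tile a square image.
    """
    if image_pix % patch_pix != 0:
        raise ValueError("patch_pix must divide image_pix evenly")
    per_side = image_pix // patch_pix
    return per_side * per_side, patch_pix * patch_pix * channels


num_patches, patch_dim = patch_grid(image_pix=32, patch_pix=4)
print(num_patches, patch_dim)  # 64 48
```

Under these assumptions each image becomes a sequence of 64 tokens (plus one if the model prepends a class token), which is the L that matters for the attention cost.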

Results

Model : layers = 4, hidden_dim = 128, feedforward_dim = 512, head_cnt = 4

Trained for 10 epochs

After the 10th epoch, RealFormer achieves 65.45% accuracy while the Transformer achieves 64.59%. RealFormer seems to consistently have about 1 percentage point higher accuracy, which seems reasonable (the paper suggested a similar result).

Model : layers = 8, hidden_dim = 128, feedforward_dim = 512, head_cnt = 4

Having 4 more layers obviously improves accuracy in general, and RealFormer still consistently wins (68.3% vs 66.3%). Notice that the larger the model, the bigger the difference seems to be, which also follows the paper. (I wonder how much of a difference it would make on ViT-Large.)

When it comes to computation time, there was almost zero difference. (I guess this is because adding the residual attention score is an O(L^2) operation, while the matrix multiplication that produces the scores fed into softmax is O(L^2 * D).)
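The guess about cost can be checked with a rough operation count. This is only a back-of-the-envelope sketch: `attention_score_cost` is a hypothetical helper, the sequence length of 64 assumes one token per 4x4 patch with no class token, and the head dimension of 32 assumes hidden_dim = 128 is split evenly over 4 heads.

```python
def attention_score_cost(seq_len: int, head_dim: int, head_cnt: int):
    """Rough per-layer operation counts for one input sequence."""
    # Q @ K^T: one multiply-add per (query, key, feature) triple, per head.
    matmul = head_cnt * seq_len * seq_len * head_dim
    # Residual attention: one addition per (query, key) score, per head.
    residual_add = head_cnt * seq_len * seq_len
    return matmul, residual_add


matmul, residual_add = attention_score_cost(seq_len=64, head_dim=32, head_cnt=4)
print(residual_add / matmul)  # 0.03125
```

The extra additions come to about 3% of the score matmul alone, and the count ignores the attention-times-value product and the feedforward layers, so the share of total compute would be smaller still.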

Conclusion

Use RealFormer. It brings a benefit with almost zero additional resources!

To make a custom RealFormer for other tasks

It's not a pip package, but you can use the ResEncoderBlock module in models.py to make an encoder-only Transformer like the following:

import torch.nn as nn
from models import ResEncoderBlock

class RealFormer(nn.Module):
...
  def __init__(self, ...):
  ...
    self.mains = nn.Sequential(*[ResEncoderBlock(emb_s = 32, head_cnt = 8, dp1 = 0.1, dp2 = 0.1) for _ in range(layer_cnt)])
  ...
  def forward(self, x):
  ...
    # prev carries the previous block's attention scores (None for the first block)
    prev = None
    for resencoder in self.mains:
        x, prev = resencoder(x, prev = prev)
  ...
    return x
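To make the `prev` threading concrete, here is a minimal runnable sketch. `ToyResidualAttention` is a hypothetical single-head stand-in written for illustration; it is not the repository's ResEncoderBlock, which may differ in details such as multiple heads, dropout, and the feedforward sublayer. It assumes, following the paper's idea, that each block adds the previous block's pre-softmax scores to its own and passes the sum on.

```python
import torch
import torch.nn as nn


class ToyResidualAttention(nn.Module):
    """Hypothetical single-head stand-in for ResEncoderBlock (not the repo's code)."""

    def __init__(self, emb_s: int):
        super().__init__()
        self.qkv = nn.Linear(emb_s, 3 * emb_s)
        self.out = nn.Linear(emb_s, emb_s)
        self.norm = nn.LayerNorm(emb_s)
        self.scale = emb_s ** -0.5

    def forward(self, x, prev=None):
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-2, -1) * self.scale  # (batch, L, L)
        if prev is not None:
            scores = scores + prev  # residual connection on the raw scores
        attn = scores.softmax(dim=-1)
        x = self.norm(x + self.out(attn @ v))
        return x, scores  # pre-softmax scores go to the next block


blocks = nn.ModuleList([ToyResidualAttention(emb_s=32) for _ in range(4)])
x = torch.randn(2, 64, 32)  # (batch, tokens, embedding)
prev = None
for block in blocks:
    x, prev = block(x, prev=prev)
print(x.shape, prev.shape)
```

The sketch holds its blocks in nn.ModuleList because each block takes two arguments and is called manually in a loop; iterating over nn.Sequential as in the snippet above works too, as long as the container itself is never called directly.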

If you're not really clear on what is going on or what to do, ask me to make this a pip package.

Owner

Simo Ryu
