Implementation of Convolutional enhanced image Transformer

Last update: Dec 13, 2022

Overview

CeiT : Convolutional enhanced image Transformer

This is an unofficial PyTorch implementation of the paper "Incorporating Convolution Designs into Visual Transformers".

Training :

python train.py -c configs/default.yaml --name "name_of_exp"

Usage :

import torch
from ceit import CeiT

img = torch.ones([1, 3, 224, 224])      # dummy input batch: [B, C, H, W]

# Model with default settings
model = CeiT(image_size = 224, patch_size = 4, num_classes = 100)
out = model(img)

print("Shape of out :", out.shape)      # [B, num_classes]

# Same model with LCA enabled
model = CeiT(image_size = 224, patch_size = 4, num_classes = 100, with_lca = True)
out = model(img)

print("Shape of out :", out.shape)      # [B, num_classes]
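The small patch_size = 4 in the usage above makes more sense once the tokenizer design from the paper is taken into account: there, patches are cut from a convolutional feature map that is already downsampled by a factor of 4, not from the raw image. The following is a rough sketch of that arithmetic only. It assumes this implementation follows the paper's stride-4 stem; the function name ceit_token_count and the stem_stride parameter are hypothetical and are not part of the ceit module.

```python
def ceit_token_count(image_size: int, patch_size: int, stem_stride: int = 4) -> int:
    """Count patch tokens, ASSUMING a conv stem that downsamples by `stem_stride`.

    Hypothetical helper for illustration; not part of the `ceit` package.
    """
    if image_size % stem_stride != 0:
        raise ValueError("image_size must be divisible by stem_stride")
    # Side length of the feature map produced by the assumed stem.
    feature_size = image_size // stem_stride
    if feature_size % patch_size != 0:
        raise ValueError("feature map size must be divisible by patch_size")
    # Patches are cut from the feature map, not from the raw image.
    patches_per_side = feature_size // patch_size
    return patches_per_side * patches_per_side


tokens = ceit_token_count(image_size=224, patch_size=4)
print("Patch tokens :", tokens)                 # 196 (a 14 x 14 grid)
print("With class token :", tokens + 1)         # 197
```

Under this assumption, a 224 x 224 image yields the same 14 x 14 token grid as a plain ViT with 16 x 16 patches.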

Note :

LCA (Layer-wise Class token Attention), enabled with with_lca = True, might not be properly implemented.

Citation :

@misc{yuan2021incorporating,
      title={Incorporating Convolution Designs into Visual Transformers}, 
      author={Kun Yuan and Shaopeng Guo and Ziwei Liu and Aojun Zhou and Fengwei Yu and Wei Wu},
      year={2021},
      eprint={2103.11816},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement :

Base ViT code is borrowed from @lucidrains's repo : https://github.com/lucidrains/vit-pytorch
Training and dataloader code is borrowed from @jeonsworld's repo : https://github.com/jeonsworld/ViT-pytorch

Owner

Rishikesh (ऋषिकेश)
