Implementation of Convolutional enhanced image Transformer

Last update: Dec 13, 2022

Overview

CeiT : Convolutional enhanced image Transformer

This is an unofficial PyTorch implementation of the paper "Incorporating Convolution Designs into Visual Transformers".

Training :

python train.py -c configs/default.yaml --name "name_of_exp"

Usage :

import torch
from ceit import CeiT

img = torch.ones([1, 3, 224, 224])
    
model = CeiT(image_size = 224, patch_size = 4, num_classes = 100)
out = model(img)

print("Shape of out :", out.shape)      # [B, num_classes]

# Same model with layer-wise class-token attention (LCA) enabled
model = CeiT(image_size = 224, patch_size = 4, num_classes = 100, with_lca = True)
out = model(img)

print("Shape of out :", out.shape)      # [B, num_classes]
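As a rough sanity check on the arguments above, the number of tokens the transformer sees can be estimated by hand. The sketch below assumes the Image-to-Tokens stem described in the paper (a 7x7 convolution with stride 2 followed by a 3x3 max-pool with stride 2) and assumes that `patch_size` is applied to the resulting feature map rather than to the raw image. The padding values and the helper names `conv_out_size` and `ceit_token_count` are illustrative and are not taken from this repository.

```python
def conv_out_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a conv or pooling layer (standard floor formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def ceit_token_count(image_size: int = 224, patch_size: int = 4) -> tuple:
    """Return (feature_map_size, num_patch_tokens) under the assumptions above."""
    # Assumed stem: Conv 7x7, stride 2, padding 3
    feat = conv_out_size(image_size, kernel=7, stride=2, padding=3)
    # Assumed stem: MaxPool 3x3, stride 2, padding 1
    feat = conv_out_size(feat, kernel=3, stride=2, padding=1)
    if feat % patch_size != 0:
        raise ValueError("feature map size must be divisible by patch_size")
    tokens_per_side = feat // patch_size
    return feat, tokens_per_side ** 2


print(ceit_token_count(224, 4))  # (56, 196)
```

Under these assumptions a 224x224 image is reduced to a 56x56 feature map, and a patch size of 4 yields 14x14 = 196 patch tokens (plus the class token).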

Note :

The layer-wise class-token attention (LCA) module might not be implemented correctly.
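For comparison, here is a minimal sketch of LCA as the paper describes it: the class tokens from all layers are collected into one sequence, and attention is computed only from the last layer's class token to all of them, followed by a feed-forward layer. This is not the code in this repository. The class name `LCASketch`, the pre-norm layout, and the default sizes are assumptions made for illustration.

```python
import torch
from torch import nn


class LCASketch(nn.Module):
    """Hypothetical layer-wise class-token attention block (illustration only)."""

    def __init__(self, dim: int, heads: int = 4, mlp_dim: int = 128):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, mlp_dim),
            nn.GELU(),
            nn.Linear(mlp_dim, dim),
        )

    def forward(self, cls_tokens: torch.Tensor) -> torch.Tensor:
        # cls_tokens: [B, L, dim], one class token per transformer layer
        x = self.norm1(cls_tokens)
        query = x[:, -1:, :]                  # only the last layer's class token queries
        attended, _ = self.attn(query, x, x)  # keys and values: class tokens of all layers
        out = cls_tokens[:, -1:, :] + attended  # residual on the last class token
        out = out + self.ffn(self.norm2(out))   # feed-forward with residual
        return out.squeeze(1)                   # [B, dim]


tokens = torch.randn(2, 12, 64)  # batch of 2, 12 layers, embedding dim 64
print(LCASketch(dim=64)(tokens).shape)
```

Because only one query token is used, the attention cost grows linearly with the number of layers rather than quadratically.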

Citation :

@misc{yuan2021incorporating,
      title={Incorporating Convolution Designs into Visual Transformers}, 
      author={Kun Yuan and Shaopeng Guo and Ziwei Liu and Aojun Zhou and Fengwei Yu and Wei Wu},
      year={2021},
      eprint={2103.11816},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement :

Base ViT code is borrowed from @lucidrains repo : https://github.com/lucidrains/vit-pytorch
Training and dataloader code is borrowed from @jeonsworld repo : https://github.com/jeonsworld/ViT-pytorch

Owner

Rishikesh (ऋषिकेश)
