LeViT a Vision Transformer in ConvNet's Clothing for Faster Inference

Last update: Jan 02, 2023

Related tags

Overview

LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference

This repository contains PyTorch evaluation code, training code and pretrained models for LeViT.

They obtain competitive tradeoffs in terms of speed / precision:

For details see LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference by Benjamin Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Hervé Jégou and Matthijs Douze.

If you use this code for a paper please cite:

@article{graham2021levit,
  title={LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference},
  author={Benjamin Graham and Alaaeldin El-Nouby and Hugo Touvron and Pierre Stock and Armand Joulin and Herv\'e J\'egou and Matthijs Douze},
  journal={arXiv preprint arXiv:22104.01136},
  year={2021}
}

Model Zoo

We provide baseline LeViT models trained with distllation on ImageNet 2012.

name	[email protected]	[email protected]	#FLOPs	#params	url
LeViT-128S	76.6	92.9	305M	7.8M	model
LeViT-128	78.6	94.0	406M	9.2M	model
LeViT-192	80.0	94.7	658M	11M	model
LeViT-256	81.6	95.4	1120M	19M	model
LeViT-384	82.6	96.0	2353M	39M	model

Usage

First, clone the repository locally:

git clone https://github.com/facebookresearch/levit.git

Then, install PyTorch 1.7.0+ and torchvision 0.8.1+ and pytorch-image-models:

conda install -c pytorch pytorch torchvision
pip install timm

Data preparation

Download and extract ImageNet train and val images from http://image-net.org/. The directory structure is the standard layout for the torchvision datasets.ImageFolder, and the training and validation data is expected to be in the train/ folder and val folder respectively:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class/2
      img4.jpeg

Evaluation

To evaluate a pre-trained LeViT-256 model on ImageNet val with a single GPU run:

python main.py --eval --model LeViT_256 --data-path /path/to/imagenet

This should give

* [email protected] 81.636 [email protected] 95.424 loss 0.750

Training

To train LeViT-256 on ImageNet with hard distillation on a single node with 8 gpus run:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model LeViT_256 --data-path /path/to/imagenet --output_dir /path/to/save

Multinode training

Distributed training is available via Slurm and submitit:

pip install submitit

To train LeViT-256 model on ImageNet on one node with 8 gpus:

python run_with_submitit.py --model LeViT_256 --data-path /path/to/imagenet

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.

Contributing

We actively welcome your pull requests! Please see CONTRIBUTING.md and CODE_OF_CONDUCT.md for more info.

LeViT a Vision Transformer in ConvNet's Clothing for Faster Inference

Related tags

Overview

LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference

Model Zoo

Usage

Data preparation

Evaluation

Training

Multinode training

License

Contributing

Owner

Facebook Research

This is the 3D Implementation of 《Inconsistency-aware Uncertainty Estimation for Semi-supervised Medical Image Segmentation》

tensorflow code for inverse face rendering

Image-to-image translation with conditional adversarial nets

official implemntation for "Contrastive Learning with Stronger Augmentations"

Exploring the Dual-task Correlation for Pose Guided Person Image Generation

Repository of Jupyter notebook tutorials for teaching the Deep Learning Course at the University of Amsterdam (MSc AI), Fall 2020

Artificial intelligence technology inferring issues and logically supporting facts from raw text

Code for "Graph-Evolving Meta-Learning for Low-Resource Medical Dialogue Generation". [AAAI 2021]

Codes for AAAI 2022 paper: Context-aware Health Event Prediction via Transition Functions on Dynamic Disease Graphs

Few-shot Neural Architecture Search

A curated list of awesome deep long-tailed learning resources.

Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis (CVPR2022)

Official Pytorch implementation of Online Continual Learning on Class Incremental Blurry Task Configuration with Anytime Inference (ICLR 2022)

Fight Recognition from Still Images in the Wild @ WACVW2022, Real-world Surveillance Workshop

A higher performance pytorch implementation of DeepLab V3 Plus(DeepLab v3+)

Tensorflow implementation of DeepLabv2

Implementation detail for paper "Multi-level colonoscopy malignant tissue detection with adversarial CAC-UNet"

Specification language for generating Generalized Linear Models (with or without mixed effects) from conceptual models

Bot developed in Python that automates races in pegaxy.

phylotorch-bito is a package providing an interface to BITO for phylotorch