Swin-Transformer is basically a hierarchical Transformer whose representation is computed with shifted windows.

Last update: Mar 14, 2022

Overview

Swin-Transformer

Swin-Transformer is a hierarchical Transformer whose representation is computed with shifted windows. For more details, please refer to "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".

This repo is a MegEngine implementation of Swin-Transformer. It also showcases training on GPU with less memory by leveraging MegEngine's DTR (Dynamic Tensor Rematerialization) technique.

There is also an official PyTorch implementation.
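
To make the shifted-window idea mentioned above concrete, here is a minimal pure-Python sketch on a toy 4x4 grid of token indices. It is not code from this repo: the helper names `cyclic_shift` and `window_partition` are illustrative, and the attention computation and the masking of wrapped-around regions are left out.

```python
def cyclic_shift(grid, shift):
    """Roll a 2-D grid up and to the left by `shift` cells, wrapping around."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid, window):
    """Split a 2-D grid into non-overlapping window x window blocks."""
    h, w = len(grid), len(grid[0])
    assert h % window == 0 and w % window == 0, "grid must tile evenly"
    return [
        [row[c0:c0 + window] for row in grid[r0:r0 + window]]
        for r0 in range(0, h, window)
        for c0 in range(0, w, window)
    ]


# Toy 4x4 "feature map" whose entries are just token indices 0..15.
feature_map = [[r * 4 + c for c in range(4)] for r in range(4)]
window_size = 2

# Regular layer: attention would run inside each 2x2 window independently.
regular_windows = window_partition(feature_map, window_size)

# Shifted layer: roll by window_size // 2 first, so the new windows
# straddle the borders of the previous ones.
shifted_windows = window_partition(
    cyclic_shift(feature_map, window_size // 2), window_size
)

print(regular_windows[0])  # [[0, 1], [4, 5]]
print(shifted_windows[0])  # [[5, 6], [9, 10]]
```

In the shifted layer, the first window holds tokens 5, 6, 9 and 10, one from each of the four regular windows. Alternating the two layouts is what lets information flow between neighbouring windows.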

Usage

Install

Clone this repo:

git clone https://github.com/MegEngine/swin-transformer.git
cd swin-transformer

Install megengine==1.6.0:

pip3 install megengine==1.6.0 -f https://megengine.org.cn/whl/mge.html

Training

To train a Swin Transformer using random data, run:

python3 train_random.py -n <num-of-gpus-to-use> -b <batch-size-per-gpu> -s <num-of-train-steps>

To train a Swin Transformer using AMP (Automatic Mixed Precision), run:

python3 train_random.py -n <num-of-gpus-to-use> -b <batch-size-per-gpu> -s <num-of-train-steps> --mode mp

To train a Swin Transformer using DTR in dynamic graph mode, run:

python3 train_random.py -n <num-of-gpus-to-use> -b <batch-size-per-gpu> -s <num-of-train-steps> --dtr [--dtr-thd <eviction-threshold-of-dtr>]

To train a Swin Transformer using DTR in static graph mode, run:

python3 train_random.py -n <num-of-gpus-to-use> -b <batch-size-per-gpu> -s <num-of-train-steps> --trace --symbolic --dtr --dtr-thd <eviction-threshold-of-dtr>

For example, to train a Swin Transformer with a single GPU using DTR in static graph mode with threshold=8GB and AMP, run:

python3 train_random.py -n 1 -b 340 -s 10 --trace --symbolic --dtr --dtr-thd 8 --mode mp
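
To show how the flags in the example above combine, the sketch below parses them with `argparse`. It is a hypothetical reconstruction, not the actual `train_random.py`: only the flag names come from the commands in this section, while the defaults, types, help strings and the `build_parser` name are assumptions.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction: flag names are taken from the commands
    # in this README; defaults, types and help strings are assumptions.
    parser = argparse.ArgumentParser(prog="train_random.py")
    parser.add_argument("-n", type=int, default=1, help="number of GPUs to use")
    parser.add_argument("-b", type=int, default=32, help="batch size per GPU")
    parser.add_argument("-s", type=int, default=10, help="number of training steps")
    parser.add_argument("--mode", default="normal", help="'mp' enables AMP")
    parser.add_argument("--trace", action="store_true")
    parser.add_argument("--symbolic", action="store_true")
    parser.add_argument("--dtr", action="store_true")
    parser.add_argument("--dtr-thd", type=int, default=None,
                        help="DTR eviction threshold in GB")
    return parser


# The flags from the single-GPU example, without the script name.
example = "-n 1 -b 340 -s 10 --trace --symbolic --dtr --dtr-thd 8 --mode mp"
args = build_parser().parse_args(example.split())

# Per this README, --trace together with --symbolic selects static graph mode.
static_graph = args.trace and args.symbolic
print(args.n, args.b, args.s, static_graph, args.dtr, args.dtr_thd, args.mode)
```

For the authoritative list of options and their defaults, use the `-h` output of the real script.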

For more usage, run:

python3 train_random.py -h

Benchmark

Testing Devices
- 2080Ti @ cuda-10.1-cudnn-v7.6.3-TensorRT-5.1.5.0 @ Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
- All CUDA memory is reserved by setting MGB_CUDA_RESERVE_MEMORY=1, to alleviate memory fragmentation
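
One way to apply the memory-reservation setting from the list above is to pass the variable through the environment of the training process. The sketch below only builds that environment and a command line. The helper name `reserved_memory_env` and the example flag values are assumptions, and nothing is launched.

```python
import os


def reserved_memory_env(base_env=None):
    """Return a copy of the environment with MGB_CUDA_RESERVE_MEMORY set to 1."""
    env = dict(os.environ if base_env is None else base_env)
    env["MGB_CUDA_RESERVE_MEMORY"] = "1"
    return env


# A tiny base environment is used here so the example is deterministic;
# pass nothing to inherit the current process environment instead.
env = reserved_memory_env({"PATH": "/usr/bin"})
command = ["python3", "train_random.py", "-n", "1", "-b", "68", "-s", "10"]

# To actually launch the run (requires the repo and a GPU):
#   import subprocess; subprocess.run(command, env=env, check=True)
print(env["MGB_CUDA_RESERVE_MEMORY"], " ".join(command))
```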

Setting	Maximum batch size	Speed (s/step)	Throughput (images/s)
None	68	0.490	139
AMP	100	0.494	202
DTR in static graph mode	300	2.592	116
DTR in static graph mode + AMP	340	1.944	175
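
The throughput column appears to be the maximum batch size divided by the step time. The short check below recomputes it from the other two columns and also reports how much larger each batch is than the baseline. The `throughput` helper is illustrative, and the numbers are copied from the table.

```python
# (setting, maximum batch size, seconds per step), copied from the table above.
rows = [
    ("None", 68, 0.490),
    ("AMP", 100, 0.494),
    ("DTR in static graph mode", 300, 2.592),
    ("DTR in static graph mode + AMP", 340, 1.944),
]


def throughput(batch_size, seconds_per_step):
    """Images processed per second for one training step."""
    return batch_size / seconds_per_step


baseline_batch = rows[0][1]
for name, batch, step_time in rows:
    print(f"{name}: {throughput(batch, step_time):.0f} images/s, "
          f"{batch / baseline_batch:.1f}x baseline batch size")
```

Read this way, DTR with AMP fits a batch five times larger than the baseline on the same card, and its throughput (175 images/s) stays above the baseline (139 images/s) but below AMP alone (202 images/s).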

Acknowledgement

This work is inspired by the Swin-Transformer repository. Many thanks to Microsoft!

Owner

Megvii MegEngine (旷视天元)
