Swin-Transformer is a hierarchical Transformer whose representation is computed with shifted windows.

Last update: Mar 14, 2022

Overview

Swin-Transformer

Swin-Transformer is a hierarchical vision Transformer whose representation is computed with shifted windows. For more details, please refer to "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
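To make the shifted-window idea concrete, here is a minimal pure-Python sketch on a 4x4 grid of token ids. The helper names `cyclic_shift` and `window_partition` are illustrative and are not taken from this repo. It only shows which tokens end up grouped together, assuming a window size of 2 and a shift of half a window; the attention computation and the masking of wrapped-around windows are left out.

```python
def cyclic_shift(grid, shift):
    """Roll a 2-D list up and to the left by `shift` positions, wrapping around."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid, window):
    """Split a 2-D list into non-overlapping window x window blocks (row-major)."""
    h, w = len(grid), len(grid[0])
    assert h % window == 0 and w % window == 0, "grid must be divisible by window"
    blocks = []
    for top in range(0, h, window):
        for left in range(0, w, window):
            blocks.append([grid[r][c]
                           for r in range(top, top + window)
                           for c in range(left, left + window)])
    return blocks


# 4x4 grid of token ids 0..15, window size 2, shift of window // 2 = 1
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
regular = window_partition(grid, 2)
shifted = window_partition(cyclic_shift(grid, 1), 2)

print(regular[0])  # tokens sharing the first regular window
print(shifted[0])  # tokens sharing the first shifted window
```

In this toy grid the first shifted window groups tokens 5, 6, 9 and 10, one from each of the four regular windows, which is how alternating the two partitions lets information flow across window boundaries.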

This repo is a MegEngine implementation of Swin-Transformer. It also showcases training on GPUs with less memory by leveraging MegEngine's DTR (Dynamic Tensor Rematerialization) technique.

There is also an official PyTorch implementation.

Usage

Install

Clone this repo:

git clone https://github.com/MegEngine/swin-transformer.git
cd swin-transformer

Install MegEngine 1.6.0:

pip3 install megengine==1.6.0 -f https://megengine.org.cn/whl/mge.html

Training

To train a Swin Transformer using random data, run:

python3 train_random.py -n <num-of-gpus-to-use> -b <batch-size-per-gpu> -s <num-of-train-steps>

To train a Swin Transformer using AMP (automatic mixed precision), run:

python3 train_random.py -n <num-of-gpus-to-use> -b <batch-size-per-gpu> -s <num-of-train-steps> --mode mp

To train a Swin Transformer using DTR in dynamic graph mode, run:

python3 train_random.py -n <num-of-gpus-to-use> -b <batch-size-per-gpu> -s <num-of-train-steps> --dtr [--dtr-thd <eviction-threshold-of-dtr>]
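For readers who want to enable DTR in their own dynamic-graph scripts, MegEngine exposes it through the `megengine.dtr` module. The sketch below sets an eviction threshold and turns DTR on. How `train_random.py` wires `--dtr` and `--dtr-thd` to these calls is an assumption here, and the helper names `threshold_string` and `enable_dtr` are hypothetical.

```python
def threshold_string(gigabytes: int) -> str:
    """Format a threshold in GB the way megengine.dtr accepts it, e.g. "8GB"."""
    return f"{gigabytes}GB"


def enable_dtr(gigabytes: int) -> None:
    """Enable DTR in dynamic graph mode with the given eviction threshold.

    MegEngine is imported lazily so this module can be loaded without it.
    Assumption: this mirrors what --dtr / --dtr-thd do in train_random.py.
    """
    import megengine as mge

    # Tensors become eviction candidates once memory usage passes this threshold.
    mge.dtr.eviction_threshold = threshold_string(gigabytes)
    mge.dtr.enable()


# Usage, before building the model and starting training:
# enable_dtr(8)
```

A lower threshold evicts tensors earlier, which saves more memory but spends more time recomputing them.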

To train a Swin Transformer using DTR in static graph mode, run:

python3 train_random.py -n <num-of-gpus-to-use> -b <batch-size-per-gpu> -s <num-of-train-steps> --trace --symbolic --dtr --dtr-thd <eviction-threshold-of-dtr>

For example, to train a Swin Transformer on a single GPU using DTR in static graph mode with an 8 GB eviction threshold and AMP, run:

python3 train_random.py -n 1 -b 340 -s 10 --trace --symbolic --dtr --dtr-thd 8 --mode mp

For more usage, run:

python3 train_random.py -h
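When launching several configurations in a row (for example to reproduce a benchmark sweep), it can help to assemble the command line programmatically. The helper below is a hypothetical sketch that uses only the flags shown in this section; `build_train_command` and its parameter names are not part of the repo. It places the script name directly after `python3`, so that the options are passed to the script rather than to the interpreter.

```python
from typing import List, Optional


def build_train_command(
    gpus: int,
    batch_size: int,
    steps: int,
    amp: bool = False,
    dtr: bool = False,
    dtr_threshold_gb: Optional[int] = None,
    static_graph: bool = False,
) -> List[str]:
    """Assemble an argv list for train_random.py from the documented flags."""
    cmd = ["python3", "train_random.py",
           "-n", str(gpus), "-b", str(batch_size), "-s", str(steps)]
    if static_graph:
        cmd += ["--trace", "--symbolic"]
    if dtr:
        cmd.append("--dtr")
        if dtr_threshold_gb is not None:
            cmd += ["--dtr-thd", str(dtr_threshold_gb)]
        elif static_graph:
            # The static graph usage line always passes a threshold, so require one.
            raise ValueError("static graph DTR is shown with an explicit --dtr-thd")
    if amp:
        cmd += ["--mode", "mp"]
    return cmd


example = build_train_command(1, 340, 10, amp=True, dtr=True,
                              dtr_threshold_gb=8, static_graph=True)
print(" ".join(example))
```

The resulting list can be passed to `subprocess.run` as is, which avoids shell quoting issues.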

Benchmark

Testing Devices
- 2080Ti @ cuda-10.1-cudnn-v7.6.3-TensorRT-5.1.5.0 @ Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
- All CUDA memory is reserved by setting MGB_CUDA_RESERVE_MEMORY=1 to alleviate memory fragmentation

Settings	Maximum Batch Size	Speed (s/step)	Throughput (images/s)
None	68	0.490	139
AMP	100	0.494	202
DTR in static graph mode	300	2.592	116
DTR in static graph mode + AMP	340	1.944	175
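The throughput column appears to be the maximum batch size divided by the step time. The short check below recomputes it from the other two columns of the table; the variable and function names are illustrative.

```python
# (setting, maximum batch size, seconds per step), copied from the table above
rows = [
    ("None", 68, 0.490),
    ("AMP", 100, 0.494),
    ("DTR in static graph mode", 300, 2.592),
    ("DTR in static graph mode + AMP", 340, 1.944),
]


def throughput(batch_size: int, seconds_per_step: float) -> float:
    """Images processed per second for one training step."""
    return batch_size / seconds_per_step


for name, batch_size, seconds in rows:
    print(f"{name}: {throughput(batch_size, seconds):.0f} images/s")
```

Read this way, DTR combined with AMP fits a batch five times larger than the baseline (340 vs. 68) while delivering roughly 26% higher throughput, whereas DTR alone trades some throughput for the larger batch.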

Acknowledgement

This work is inspired by the Swin-Transformer repository. Many thanks to Microsoft!

Owner

Megvii Tianyuan (旷视天元) MegEngine