PyTorch implementation of the paper Dynamic Token Normalization Improves Vision Transformers.

Last update: Oct 09, 2022

Overview

Dynamic Token Normalization Improves Vision Transformers

This is the PyTorch implementation of the paper Dynamic Token Normalization Improves Vision Transformers. Code and models will be available soon.

Dynamic Token Normalization

We design a novel normalization method, termed Dynamic Token Normalization (DTN), which inherits the advantages of LayerNorm and InstanceNorm. DTN can be plugged into various transformer models and consistently improves their performance.
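To make the LayerNorm/InstanceNorm contrast concrete, here is a minimal sketch, not the paper's formulation. For a token tensor of shape (batch, tokens, channels), LayerNorm computes statistics per token over channels, while InstanceNorm computes them per channel over tokens. The function name `mixed_token_norm` and the fixed scalar `lam` are hypothetical, introduced only to show how the two sets of statistics differ and could be blended.

```python
import torch
import torch.nn.functional as F


def mixed_token_norm(x: torch.Tensor, lam: float = 0.5, eps: float = 1e-5) -> torch.Tensor:
    """Blend LayerNorm-style and InstanceNorm-style statistics.

    x has shape (batch, tokens, channels). `lam` is a hypothetical fixed
    mixing weight used for illustration; it is not taken from the paper.
    """
    # LayerNorm view: one mean/variance per token, computed over channels.
    mu_ln = x.mean(dim=-1, keepdim=True)
    var_ln = x.var(dim=-1, unbiased=False, keepdim=True)

    # InstanceNorm view: one mean/variance per channel, computed over tokens.
    mu_in = x.mean(dim=1, keepdim=True)
    var_in = x.var(dim=1, unbiased=False, keepdim=True)

    # Convex combination of the two sets of statistics (broadcast to x's shape).
    mu = lam * mu_ln + (1.0 - lam) * mu_in
    var = lam * var_ln + (1.0 - lam) * var_in
    return (x - mu) / torch.sqrt(var + eps)


if __name__ == "__main__":
    x = torch.randn(2, 5, 8)  # 2 images, 5 tokens, 8 channels
    print(mixed_token_norm(x).shape)
```

With `lam=1.0` the sketch reduces to plain LayerNorm without affine parameters, and with `lam=0.0` to InstanceNorm over the token axis. Intermediate values only illustrate the idea of combining intra-token and inter-token statistics.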

Comparison of top-1 and top-5 accuracy (%) on the ImageNet validation set for ViT models trained with LN and with DTN.

Model	Top-1	Top-5
ViT-T*-LN	72.3	91.4
ViT-T*-DTN	73.2	91.7
ViT-S*-LN	80.6	95.2
ViT-S*-DTN	81.7	95.8
ViT-B*-LN	81.7	95.8
ViT-B*-DTN	82.5	96.1

Getting Started

Install PyTorch

Clone the repo:

git clone https://github.com/dtn-anonymous/DTN.git

Requirements

Install CUDA==10.1 with cudnn7 following the official installation instructions
Install PyTorch==1.7.1 and torchvision==0.8.2 with CUDA==10.1:

conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.1 -c pytorch

Install timm==0.3.2:

pip install timm==0.3.2
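After installation, the pinned versions can be checked with a short script. This is a minimal sketch that assumes Python 3.8 or later (for `importlib.metadata`) and that the packages expose standard distribution metadata under the names `torch`, `torchvision`, and `timm`. The helper names are illustrative and not part of the repository.

```python
from importlib import metadata

# Versions pinned in the Requirements section above.
PINS = {"torch": "1.7.1", "torchvision": "0.8.2", "timm": "0.3.2"}


def base_version(version: str) -> str:
    """Drop a local build suffix such as '+cu101' from a version string."""
    return version.split("+", 1)[0]


def find_mismatches(pins, lookup=metadata.version):
    """Return {package: (expected, found)} for every pin that is not met.

    `found` is None when the package is not installed.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found is None or base_version(found) != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    # Prints nothing when every pinned package matches.
    for name, (expected, found) in find_mismatches(PINS).items():
        print(f"{name}: expected {expected}, found {found}")
```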

Data Preparation

Download the ImageNet dataset. It should contain the train and val directories and the txt files that map images to labels.
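Before training, it can help to sanity-check the dataset layout. The sketch below makes two assumptions that the repository does not confirm: each line of a label file has the form `<relative/image/path> <integer label>`, and image paths are relative to the split directory. The function names are hypothetical.

```python
from pathlib import Path


def check_layout(root):
    """Return a list of problems found under the dataset root (empty if none)."""
    root = Path(root)
    problems = []
    for split in ("train", "val"):
        if not (root / split).is_dir():
            problems.append(f"missing directory: {split}")
    return problems


def parse_label_file(path):
    """Read lines of the assumed form '<relative/image/path> <integer label>'."""
    pairs = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        image, label = line.rsplit(maxsplit=1)
        pairs.append((image, int(label)))
    return pairs


def missing_images(split_dir, pairs):
    """Return the listed image paths that do not exist under split_dir."""
    split_dir = Path(split_dir)
    return [image for image, _ in pairs if not (split_dir / image).is_file()]
```

A typical use would be to call `check_layout` on the dataset root first, then run `parse_label_file` and `missing_images` on each split to catch incomplete downloads early.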

Training a model from scratch

An example training script is provided in DTN/scripts/train.sh. To train ViT-S* with DTN:

cd DTN/scripts   
sh train.sh layer vit_norm_s_star configs/ViT/vit.yaml

The number of GPUs and the configuration file can be changed in train.sh.

Owner

Wenqi Shao
