SiT: Self-supervised vIsion Transformer

Last update: Dec 28, 2022


This repository contains the official PyTorch code for self-supervised pre-training, finetuning, and evaluation of SiT (Self-supervised vIsion Transformer).

The training strategy is adopted from DeiT.

Usage

Create an environment

conda create -n SiT python=3.8

Activate the environment and install the necessary packages

conda activate SiT

conda install pytorch torchvision torchaudio cudatoolkit=11.0 -c pytorch

pip install -r requirements.txt
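After installing, a quick check that the main packages are visible from the active environment can save a failed launch later. The sketch below is not part of the repository. It uses only the standard library, and the package list is an assumption taken from the conda command above (requirements.txt may pull in more).

```python
from importlib.util import find_spec

# Packages named in the conda install command above; extend as needed.
PACKAGES = ("torch", "torchvision", "torchaudio")


def check_packages(names=PACKAGES):
    """Return {package_name: True/False} without importing the packages."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Run it inside the activated SiT environment; any package reported as MISSING needs to be installed before launching training.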

Self-supervised pre-training

python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --batch-size 72 --epochs 501 --min-lr 5e-6 --lr 1e-3 --training-mode 'SSL' --data-set 'STL10' --output 'checkpoints/SSL/STL10' --validate-every 10
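To make the launch arithmetic explicit: torch.distributed.launch starts --nproc_per_node processes on the node (here 4, typically one per GPU). Assuming --batch-size is counted per process, as it is in DeiT (this README does not state it explicitly), the effective batch size is the product of the two. A minimal sketch, with a hypothetical helper name:

```python
def effective_batch_size(per_process_batch: int, nproc_per_node: int, nnodes: int = 1) -> int:
    """Total samples per optimizer step, ASSUMING --batch-size is per process."""
    return per_process_batch * nproc_per_node * nnodes


# Values taken from the pre-training command above.
print(effective_batch_size(72, 4))  # 288
```

If fewer GPUs are available, lowering --nproc_per_node shrinks the effective batch size under this assumption, so the learning rate may need retuning.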

Finetuning

python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --batch-size 120 --epochs 501 --min-lr 5e-6 --training-mode 'finetune' --data-set 'STL10' --finetune 'checkpoints/SSL/STL10/checkpoint.pth' --output 'checkpoints/finetune/STL10' --validate-every 10
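The --finetune path in the command above points at checkpoint.pth inside the pre-training --output directory. A small pre-flight check can confirm that the file exists before a long job is launched. This is an illustrative helper, not part of the repository, and the function name is hypothetical.

```python
from pathlib import Path


def find_pretrained_checkpoint(ssl_output_dir: str, filename: str = "checkpoint.pth") -> Path:
    """Return the checkpoint path inside the pre-training output directory.

    Raises FileNotFoundError if the file is missing, so the problem surfaces
    before the distributed job is launched.
    """
    path = Path(ssl_output_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(f"No pre-trained checkpoint at {path}; run pre-training first.")
    return path


# Directory taken from the --output flag of the pre-training command.
try:
    ckpt = find_pretrained_checkpoint("checkpoints/SSL/STL10")
    print(f"Using {ckpt}")
except FileNotFoundError as err:
    print(err)
```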

Linear Evaluation

Linear projection Head

python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --batch-size 120 --epochs 501 --lr 1e-3 --weight-decay 5e-4 --min-lr 5e-6 --training-mode 'finetune' --data-set 'STL10' --finetune 'checkpoints/SSL/STL10/checkpoint.pth' --output 'checkpoints/finetune/STL10_LE' --validate-every 10 --SiT_LinearEvaluation 1

2-layer MLP projection Head

python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --batch-size 120 --epochs 501 --lr 1e-3 --weight-decay 5e-4 --min-lr 5e-6 --training-mode 'finetune' --data-set 'STL10' --finetune 'checkpoints/SSL/STL10/checkpoint.pth' --output 'checkpoints/finetune/STL10_LE_hidden' --validate-every 10 --SiT_LinearEvaluation 1 --representation-size 1024

Note: set the --dataset_location parameter to the directory where the dataset was downloaded.
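For example, the pre-training command with the dataset location set can be assembled as below. This is a sketch: /path/to/STL10 is a placeholder, and build_pretrain_command is a hypothetical helper rather than part of the repository. All other flags are copied from the pre-training command above.

```python
import shlex


def build_pretrain_command(dataset_location: str, nproc_per_node: int = 4) -> list:
    """Assemble the README's pre-training command plus --dataset_location."""
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={nproc_per_node}", "--use_env", "main.py",
        "--batch-size", "72",
        "--epochs", "501",
        "--min-lr", "5e-6",
        "--lr", "1e-3",
        "--training-mode", "SSL",
        "--data-set", "STL10",
        "--dataset_location", dataset_location,  # placeholder: your download directory
        "--output", "checkpoints/SSL/STL10",
        "--validate-every", "10",
    ]


cmd = build_pretrain_command("/path/to/STL10")
print(shlex.join(cmd))  # paste into a shell, or pass cmd to subprocess.run
```

The same flag can be appended to the finetuning and linear evaluation commands in the same way.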

If you use this code for a paper, please cite:

@article{atito2021sit,

  title={SiT: Self-supervised vIsion Transformer},

  author={Atito, Sara and Awais, Muhammad and Kittler, Josef},

  journal={arXiv preprint arXiv:2104.03602},

  year={2021}

}

License

This repository is released under the GNU General Public License.

Owner

Sara Ahmed
