Implementation of Momentum^2 Teacher

Overview

Momentum^2 Teacher: Momentum Teacher with Momentum Statistics for Self-Supervised Learning
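
In brief, the method maintains a momentum (EMA) teacher, as in BYOL/MoCo, and additionally replaces the teacher's per-batch BN statistics with momentum-updated running statistics, so the teacher does not depend on large or synchronized batches. Below is a minimal sketch of both ideas using standard PyTorch modules; all names and momentum values are illustrative, not this repo's actual API.

import torch
import torch.nn as nn

@torch.no_grad()
def update_teacher(student, teacher, m=0.996):
    # "Momentum teacher": teacher weights are an exponential moving
    # average (EMA) of the student's weights, as in BYOL/MoCo.
    for p_s, p_t in zip(student.parameters(), teacher.parameters()):
        p_t.data.mul_(m).add_(p_s.data, alpha=1.0 - m)

class MomentumBatchNorm2d(nn.BatchNorm2d):
    # "Momentum statistics": normalize with momentum-updated running
    # statistics instead of per-batch statistics (a hypothetical
    # simplification, not the repo's actual layer).
    def forward(self, x):
        if self.training:
            with torch.no_grad():
                # Fold the current batch statistics into the running ones.
                var, mean = torch.var_mean(x, dim=(0, 2, 3), unbiased=False)
                self.running_mean.mul_(1 - self.momentum).add_(mean, alpha=self.momentum)
                self.running_var.mul_(1 - self.momentum).add_(var, alpha=self.momentum)
        # Normalize with the running (momentum) statistics in both modes.
        x = (x - self.running_mean[None, :, None, None]) / torch.sqrt(
            self.running_var[None, :, None, None] + self.eps)
        return x * self.weight[None, :, None, None] + self.bias[None, :, None, None]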

Requirements

  1. All experiments were run with Python 3.6, torch==1.5.0, and torchvision==0.6.0

Usage

Data Preparation

Prepare the ImageNet data in ${root_of_your_clone}/data/imagenet_train and ${root_of_your_clone}/data/imagenet_val. Since we use an internal platform (storage) to read ImageNet, the local mode has not been tested; you may need to modify momentum_teacher/data/dataset.py to support it, e.g. along the lines of the sketch below.
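
For reference, a plausible local-mode reader can be built on torchvision's ImageFolder, which matches the standard ImageNet folder layout. This is a hypothetical sketch, not the actual code in momentum_teacher/data/dataset.py:

from torchvision import transforms
from torchvision.datasets import ImageFolder

def build_local_imagenet(root, transform):
    # ImageFolder reads <root>/<class_name>/<image> from local disk,
    # which matches the usual ImageNet train/val directory layout.
    return ImageFolder(root=root, transform=transform)

# Example: a minimal training-time transform (the repo's augmentations differ).
train_set = build_local_imagenet(
    "data/imagenet_train",
    transforms.Compose([transforms.RandomResizedCrop(224), transforms.ToTensor()]),
)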

Training

Before training, ensure the root of the clone (namely ${root_of_clone}) is added to your PYTHONPATH, e.g.

export PYTHONPATH=$PYTHONPATH:${root_of_clone}

To run unsupervised pre-training of a ResNet-50 model on ImageNet on an 8-GPU machine:

  1. use -d to specify the GPU ids for training, e.g., -d 0-7
  2. use -b to specify the batch size, e.g., -b 256
  3. use --experiment-name to specify the output folder; the training log and models will be dumped to ./outputs/${experiment-name}
  4. use -f to specify the description file of your experiment.

e.g.,

python3 momentum_teacher/tools/train.py -b 256 -d 0-7 --experiment-name your_exp -f momentum_teacher/exps/arxiv/exp_8_v100/momentum2_teacher_100e_exp.py

Linear Evaluation

With a pre-trained model, to train a supervised linear classifier on frozen features/weights on an 8-GPU machine:

  1. use -d to specify the GPU ids for training, e.g., -d 0-7
  2. use -b to specify the batch size, e.g., -b 256
  3. use --experiment-name to specify the folder that holds the pre-trained models.

e.g.,

python3 momentum_teacher/tools/eval.py -b 256 --experiment-name your_exp -f momentum_teacher/exps/arxiv/linear_eval_exp_byol.py
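
For orientation, linear evaluation conventionally freezes the pre-trained backbone and trains only a linear classifier on its features. A minimal sketch of that setup follows; the checkpoint path and key names are hypothetical, and the repo's eval code may differ:

import torch
import torch.nn as nn
from torchvision.models import resnet50

backbone = resnet50()
state = torch.load("outputs/your_exp/checkpoint.pth", map_location="cpu")  # hypothetical path
backbone.load_state_dict(state, strict=False)  # checkpoint keys may need remapping

for p in backbone.parameters():      # freeze every pre-trained weight
    p.requires_grad = False
backbone.fc = nn.Linear(2048, 1000)  # fresh, trainable linear head

# Only the linear head is optimized; a large lr is common for frozen features.
optimizer = torch.optim.SGD(backbone.fc.parameters(), lr=30.0, momentum=0.9)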

Results

Results of Pretraining on a Single Machine

After pre-training on 8 NVIDIA V100 GPUs with a batch size of 1024, the linear-evaluation results are:

| pre-train code | pre-train epochs | pre-train time | accuracy | weights |
| -------------- | ---------------- | -------------- | -------- | ------- |
| path           | 100              | ~1.8 days      | 70.7     | -       |
| path           | 200              | ~3.6 days      | 72.7     | -       |
| path           | 300              | ~5.5 days      | 73.8     | -       |

After pre-training on 8 NVIDIA 2080 GPUs with a batch size of 256, the linear-evaluation results are:

| pre-train code | pre-train epochs | pre-train time | accuracy | weights |
| -------------- | ---------------- | -------------- | -------- | ------- |
| path           | 100              | ~2.5 days      | 70.4     | -       |
| path           | 200              | ~5 days        | 72.3     | -       |
| path           | 300              | ~7.5 days      | 72.9     | -       |

Results of Pretraining on Multiple Machines

E.g., to run unsupervised pre-training with a batch size of 4096 on 32 V100 GPUs, assuming 4 machines with 8 V100 GPUs each, run:

# machine 1:
export MACHINE=0; export MACHINE_TOTAL=4; python3 momentum_teacher/tools/train.py -b 4096 -f xxx
# machine 2:
export MACHINE=1; export MACHINE_TOTAL=4; python3 momentum_teacher/tools/train.py -b 4096 -f xxx
# machine 3:
export MACHINE=2; export MACHINE_TOTAL=4; python3 momentum_teacher/tools/train.py -b 4096 -f xxx
# machine 4:
export MACHINE=3; export MACHINE_TOTAL=4; python3 momentum_teacher/tools/train.py -b 4096 -f xxx
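
The script presumably derives each process's global rank from MACHINE and MACHINE_TOTAL. As a rough illustration (not this repo's actual code), such variables typically map onto torch.distributed initialization like this:

import os
import torch.distributed as dist

machine = int(os.environ.get("MACHINE", 0))            # index of this machine
machine_total = int(os.environ.get("MACHINE_TOTAL", 1))
gpus_per_machine = 8
local_rank = 0  # per-process GPU index, set by the launcher

dist.init_process_group(
    backend="nccl",
    init_method="tcp://master_ip:23456",               # hypothetical address
    world_size=machine_total * gpus_per_machine,
    rank=machine * gpus_per_machine + local_rank,
)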

Results of linear evaluation:

| pre-train code | pre-train epochs | pre-train time | accuracy | weights |
| -------------- | ---------------- | -------------- | -------- | ------- |
| path           | 100              | ~11 hours      | 70.3     | -       |
| path           | 200              | ~22 hours      | 72.5     | -       |
| path           | 300              | ~33 hours      | 73.7     | -       |

To run unsupervised pre-training with a batch size of 4096 on 128 2080 GPUs, please follow the guide above. Results of linear evaluation:

| pre-train code | pre-train epochs | pre-train time | accuracy | weights |
| -------------- | ---------------- | -------------- | -------- | ------- |
| path           | 100              | ~5 hours       | 69.0     | -       |
| path           | 200              | ~10 hours      | 71.5     | -       |
| path           | 300              | ~15 hours      | 72.3     | -       |

Disclaimer

This is an implementation of Momentum^2 Teacher; it is worth noting that:

  • The original implementation is based on our internal platform.
  • This released version performs slightly better than the results reported in the tech report.