Public Code for NIPS submission SimiGrad: Fine-Grained Adaptive Batching for Large ScaleTraining using Gradient Similarity Measurement

Related tags

Deep LearningSimiGrad
Overview

Public code for NIPS submission "SimiGrad: Fine-Grained Adaptive Batching for Large Scale Training using Gradient Similarity Measurement"

This repo contains both our SimiGrad framework (integrated with DeepSpeed) and all training codes used to generate the results in the paper.

Installation

Please use ./DeepSpeed/install.sh to install our SimiGrad framework. For detailed installation options please see ./DeepSpeed/install.sh . It is recommended that you use a virtual environment to install SimiGrad.

Usage

To use SimiGrad, simply add an additional parameter adaptive_batch_params when initializing DeepSpeed. For example,

model, optimizer, _, _ = deepspeed.initialize(
        args=...,
        model=...,
        model_parameters=...,
        adaptive_batch_params={
            "enable_adjust": args.similarity_target, # bool, set to `True` to use adaptive batch size and `False` for fixed batch size
            "verbose": True, # bool, set to `True` to print details of batch size adjustment
            "similarity_target":args.similarity_target, # float, -1.0~1.0, the similarity target that controls how aggressive the batch size adjustment is.
            "batch_size_lower_bound":args.batchsize_lower_bound, # int, optional, the lower bound of batch size. Recommended only if you have a well-tuned warmup learning rate scheduling.
            "batch_size_upper_bound":args.batchsize_upper_bound, # int, optional, the upper bound of batch size.
            "max_micro_batch_size":args.max_micro_batch_size, # int, optional, the upper bound of micro batch size to prevent out-of-memory error. If unspecified, the initial micro batch size will be used as the max_micro_batch_size.})

Please refer to our code (e.g. DeepSpeedExamples/pytorch-cifar/main.py) for details such as how to read the metrics from the framework.

For usage of DeepSpeed, please refer to their website https://www.deepspeed.ai/

Reproduce Paper's Results

The parameters we used to get the claimed results are included in the paper.

BERT Large Pretrain

All scripts can be found in DeepSpeedExamples/bert_pretrain/. Please use the script ds_train_bert_bsz64k_seq128.sh for BERT Large pretrain with sequence length 128 (epoch 1-150). You need to specify the parameters like similarity_target and also the location of the WikiandBookCorpus dataset in the script.

After the sequence length 128 pretrain, use ds_train_bert_bsz32k_seq512.sh to finish the sequence length 512 part of pretrain (epoch 151-170). You need to specify the checkpoint from sequence length 128 pretrain for the sequence length 512 to start with. Then the BERT Large model is ready for downstream tasks.

SQuAD Score from BERT Large Pretrain

After the BERT pretrain, use DeepSpeedExamples/BingBertSquad/run_squad_deepspeed.sh to get the SQuAD 1.1 score. You need to specify the checkpoint from sequence length 512 pretrain and the location of SQuAD 1.1 dataset.

ResNet18 on CIFAR10

All scripts can be found in DeepSpeedExamples/pytorch-cifar/. Use the script run.sh to train ResNet18 with specific parameters. Use the grid_search.py and baseline_grid_search.py to get the Pareto results of test acc vs. batch size in the paper.

ResNet50 on ImageNet

All scripts can be found in DeepSpeedExamples/imagenet_deepspeed/. Use the script run_with2kmin.sh to train ResNet50 with spcific parameters.

Future of SimiGrad

SimiGrad will be officially integrated as part of DeepSpeed soon!

Owner
Heyang Qin
Heyang Qin
Here we present the implementation in TensorFlow of our work about liver lesion segmentation accepted in the Machine Learning 4 Health Workshop

Detection-aided liver lesion segmentation Here we present the implementation in TensorFlow of our work about liver lesion segmentation accepted in the

Image Processing Group - BarcelonaTECH - UPC 96 Oct 26, 2022
Scientific Computation Methods in C and Python (Open for Hacktoberfest 2021)

Sci - cpy README is a stub. Do expand it. Objective This repository is meant to be a ready reference for scientific computation methods. Do ⭐ it if yo

Sandip Dutta 7 Oct 12, 2022
[CVPR 2022] Structured Sparse R-CNN for Direct Scene Graph Generation

Structured Sparse R-CNN for Direct Scene Graph Generation Our paper Structured Sparse R-CNN for Direct Scene Graph Generation has been accepted by CVP

Multimedia Computing Group, Nanjing University 44 Dec 23, 2022
Unofficial PyTorch implementation of SimCLR by Google Brain

Unofficial PyTorch implementation of SimCLR by Google Brain

Rishabh Anand 2 Oct 13, 2021
Implementation of Bagging and AdaBoost Algorithm

Bagging-and-AdaBoost Implementation of Bagging and AdaBoost Algorithm Dataset Red Wine Quality Data Sets For simplicity, we will have 2 classes of win

Zechen Ma 1 Nov 01, 2021
AFL binary instrumentation

E9AFL --- Binary AFL E9AFL inserts American Fuzzy Lop (AFL) instrumentation into x86_64 Linux binaries. This allows binaries to be fuzzed without the

242 Dec 12, 2022
Official implementation for the paper: Generating Smooth Pose Sequences for Diverse Human Motion Prediction

Generating Smooth Pose Sequences for Diverse Human Motion Prediction This is official implementation for the paper Generating Smooth Pose Sequences fo

Wei Mao 28 Dec 10, 2022
LSTMs (Long Short Term Memory) RNN for prediction of price trends

Price Prediction with Recurrent Neural Networks LSTMs BTC-USD price prediction with deep learning algorithm. Artificial Neural Networks specifically L

5 Nov 12, 2021
Links to works on deep learning algorithms for physics problems, TUM-I15 and beyond

Links to works on deep learning algorithms for physics problems, TUM-I15 and beyond

Nils Thuerey 1.3k Jan 08, 2023
Writeups for the challenges from DownUnderCTF 2021

cloud Challenge Author Difficulty Release Round Bad Bucket Blue Alder easy round 1 Not as Bad Bucket Blue Alder easy round 1 Lost n Found Blue Alder m

DownUnderCTF 161 Dec 31, 2022
PyTorch implementation of paper: AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer, ICCV 2021.

AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer [Paper] [PyTorch Implementation] [Paddle Implementation] Overview This reposit

148 Dec 30, 2022
Keras implementations of Generative Adversarial Networks.

This repository has gone stale as I unfortunately do not have the time to maintain it anymore. If you would like to continue the development of it as

Erik Linder-Norén 8.9k Jan 04, 2023
Official repository of "Investigating Tradeoffs in Real-World Video Super-Resolution"

RealBasicVSR [Paper] This is the official repository of "Investigating Tradeoffs in Real-World Video Super-Resolution, arXiv". This repository contain

Kelvin C.K. Chan 566 Dec 28, 2022
QR2Pass-project - A proof of concept for an alternative (passwordless) authentication system to a web server

QR2Pass This is a proof of concept for an alternative (passwordless) authenticat

4 Dec 09, 2022
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data

Introduction PyTorch3D provides efficient, reusable components for 3D Computer Vision research with PyTorch. Key features include: Data structure for

Facebook Research 6.8k Jan 01, 2023
QA-GNN: Question Answering using Language Models and Knowledge Graphs

QA-GNN: Question Answering using Language Models and Knowledge Graphs This repo provides the source code & data of our paper: QA-GNN: Reasoning with L

Michihiro Yasunaga 434 Jan 04, 2023
MTCNN face detection implementation for TensorFlow, as a PIP package.

MTCNN Implementation of the MTCNN face detector for Keras in Python3.4+. It is written from scratch, using as a reference the implementation of MTCNN

Iván de Paz Centeno 1.9k Dec 30, 2022
A annotation of yolov5-5.0

代码版本:0714 commit #4000 $ git clone https://github.com/ultralytics/yolov5 $ cd yolov5 $ git checkout 720aaa65c8873c0d87df09e3c1c14f3581d4ea61 这个代码只是注释版

Laughing 229 Dec 17, 2022
Neural Turing Machines (NTM) - PyTorch Implementation

PyTorch Neural Turing Machine (NTM) PyTorch implementation of Neural Turing Machines (NTM). An NTM is a memory augumented neural network (attached to

Guy Zana 519 Dec 21, 2022
Match SafeGraph POIs with Data collected through a cultural resource survey in Washington DC.

Match SafeGraph POI data with Cultural Resource Places in Washington DC Match SafeGraph POIs with Data collected through a cultural resource survey in

Changjie Chen 1 Jan 05, 2022