Near-Optimal Sparse Allreduce for Distributed Deep Learning (published in PPoPP'22)

Related tags

Deep LearningOk-Topk
Overview

Near-Optimal Sparse Allreduce for Distributed Deep Learning (published in PPoPP'22)

Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k communication volume which is asymptotically optimal) with the decentralized parallel Stochastic Gradient Descent (SGD) optimizer, and its convergence is proved theoretically and empirically.

Setup the environment

To install the required Python modules:

conda create --name py38_oktopk python=3.8

conda activate py38_oktopk

pip3 install pip==20.2.4

pip install -r requirements.txt

MPICC="cc -shared" pip install --no-binary=mpi4py mpi4py

git clone https://github.com/NVIDIA/apex

cd apex

pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Prepare Datasets

Cifar-10 for VGG

cd ./VGG/vgg_data

wget https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz

tar -zxvf cifar-10-python.tar.gz

AN4 for LSTM

cd ./LSTM/audio_data

wget https://www.dropbox.com/s/l5w4up20u5pfjxf/an4.zip

unzip an4.zip

Wikipedia for BERT

cd ./BERT/bert/bert_data/

Prepare the dataset according to the README file.

Run jobs

We run experiments on GPU clusters with SLURM job scheduler. To evaluate the performance of Ok-Topk, Gaussiank, gtopk, topkA, topkDSA, and dense, run the jobs as follows.

To run VGG jobs

cd ./VGG

./sbatch_vgg_jobs.sh

To run LSTM jobs

cd ./LSTM

./sbatch_lstm_jobs.sh

To run BERT jobs

cd ./BERT/bert/

./sbatch_bert_jobs.sh

Publication

The work of Ok-Topk is pulished in PPoPP'22. DOI

License

See LICENSE.

Owner
Shigang Li
Shigang Li
The pytorch implementation of the paper "text-guided neural image inpainting" at MM'2020

TDANet: Text-Guided Neural Image Inpainting, MM'2020 (Oral) MM | ArXiv This repository implements the paper "Text-Guided Neural Image Inpainting" by L

LisaiZhang 75 Dec 22, 2022
Unofficial implementation of the paper: PonderNet: Learning to Ponder in TensorFlow

PonderNet-TensorFlow This is an Unofficial Implementation of the paper: PonderNet: Learning to Ponder in TensorFlow. Official PyTorch Implementation:

1 Oct 23, 2022
Python based Advanced AI Assistant

Knick is a virtual artificial intelligence project, fully developed in python. The objective of this project is to develop a virtual assistant that can handle our minor, intermediate as well as heavy

19 Nov 15, 2022
High dimensional black-box optimizer using Latent Action Monte Carlo Tree Search algorithm

LA-MCTS The code is based of paper Learning Search Space Partition for Black-box Optimization using Monte Carlo Tree Search. Component LA-MCTS has thr

Meta Research 18 Oct 24, 2022
PyTorch implementation of Off-policy Learning in Two-stage Recommender Systems

Off-Policy-2-Stage This repo provides a PyTorch implementation of the MovieLens experiments for the following paper: Off-policy Learning in Two-stage

Jiaqi Ma 25 Dec 12, 2022
Efficient Householder transformation in PyTorch

Efficient Householder Transformation in PyTorch This repository implements the Householder transformation algorithm for calculating orthogonal matrice

Anton Obukhov 49 Nov 20, 2022
BBScan py3 - BBScan py3 With Python

BBScan_py3 This repository is forked from lijiejie/BBScan 1.5. I migrated the fo

baiyunfei 12 Dec 30, 2022
Kaggleship: Kaggle Notebooks

Kaggleship: Kaggle Notebooks This repository contains my Kaggle notebooks. They are generally about data science, machine learning, and deep learning.

Erfan Sobhaei 1 Jan 25, 2022
Artificial intelligence technology inferring issues and logically supporting facts from raw text

개요 비정형 텍스트를 학습하여 쟁점별 사실과 논리적 근거 추론이 가능한 인공지능 원천기술 Artificial intelligence techno

6 Dec 29, 2021
Official PyTorch code for the paper: "Point-Based Modeling of Human Clothing" (ICCV 2021)

Point-Based Modeling of Human Clothing Paper | Project page | Video This is an official PyTorch code repository of the paper "Point-Based Modeling of

Visual Understanding Lab @ Samsung AI Center Moscow 64 Nov 22, 2022
Implement the Pareto Optimizer and pcgrad to make a self-adaptive loss for multi-task

multi-task_losses_optimizer Implement the Pareto Optimizer and pcgrad to make a self-adaptive loss for multi-task 已经实验过了,不会有cuda out of memory情况 ##Par

14 Dec 25, 2022
基于Paddle框架的fcanet复现

fcanet-Paddle 基于Paddle框架的fcanet复现 fcanet 本项目基于paddlepaddle框架复现fcanet,并参加百度第三届论文复现赛,将在2021年5月15日比赛完后提供AIStudio链接~敬请期待 参考项目: frazerlin-fcanet 数据准备 本项目已挂

QuanHao Guo 7 Mar 07, 2022
"Segmenter: Transformer for Semantic Segmentation" reproduced via mmsegmentation

Segmenter-based-on-OpenMMLab "Segmenter: Transformer for Semantic Segmentation, arxiv 2105.05633." reproduced via mmsegmentation. We reproduce Segment

EricKani 22 Feb 24, 2022
A supplementary code for Editable Neural Networks, an ICLR 2020 submission.

Editable neural networks A supplementary code for Editable Neural Networks, an ICLR 2020 submission by Anton Sinitsin, Vsevolod Plokhotnyuk, Dmitry Py

Anton Sinitsin 32 Nov 29, 2022
FCOS: Fully Convolutional One-Stage Object Detection (ICCV'19)

FCOS: Fully Convolutional One-Stage Object Detection This project hosts the code for implementing the FCOS algorithm for object detection, as presente

Tian Zhi 3.1k Jan 05, 2023
Patch Rotation: A Self-Supervised Auxiliary Task for Robustness and Accuracy of Supervised Models

Patch-Rotation(PatchRot) Patch Rotation: A Self-Supervised Auxiliary Task for Robustness and Accuracy of Supervised Models Submitted to Neurips2021 To

4 Jul 12, 2021
(AAAI2022) Style Mixing and Patchwise Prototypical Matching for One-Shot Unsupervised Domain Adaptive Semantic Segmentation

SM-PPM This is a Pytorch implementation of our paper "Style Mixing and Patchwise Prototypical Matching for One-Shot Unsupervised Domain Adaptive Seman

W-zx-Y 10 Dec 07, 2022
D2LV: A Data-Driven and Local-Verification Approach for Image Copy Detection

Facebook AI Image Similarity Challenge: Matching Track —— Team: imgFp This is the source code of our 3rd place solution to matching track of Image Sim

16 Dec 25, 2022
Codes for ACL-IJCNLP 2021 Paper "Zero-shot Fact Verification by Claim Generation"

Zero-shot-Fact-Verification-by-Claim-Generation This repository contains code and models for the paper: Zero-shot Fact Verification by Claim Generatio

Liangming Pan 47 Jan 01, 2023
Tidy interface to polars

tidypolars tidypolars is a data frame library built on top of the blazingly fast polars library that gives access to methods and functions familiar to

Mark Fairbanks 144 Jan 08, 2023