Structured Data Gradient Pruning (SDGP)

Overview

Weight pruning is a technique for making Deep Neural Network (DNN) inference more computationally efficient by reducing the number of model parameters over the course of training. However, most weight pruning techniques do not speed up DNN training and can even require more iterations to reach model convergence. In this work, we propose a novel Structured Data Gradient Pruning (SDGP) method that speeds up training without impacting model convergence. This approach enforces a specific sparsity structure, where only N out of every M elements in a matrix can be nonzero, making it amenable to hardware acceleration. Modern accelerators such as the NVIDIA A100 GPU support this type of structured sparsity for 2 nonzeros per 4 elements in a reduction. Assuming hardware support for 2:4 sparsity, our approach can achieve a 15-25% reduction in total training time without a significant impact on model performance.
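For intuition, here is a minimal sketch of N:M group pruning in PyTorch. This is our own illustration, not the repository's sdgp.py: the function name prune_nm is ours, and the m / n rescaling factor is one plausible way to compensate for the pruned mass, in the spirit of the "Rescale Mag." variant evaluated in the tables below; the paper's exact rescaling rule may differ.

```python
import torch

def prune_nm(x: torch.Tensor, n: int = 2, m: int = 4, rescale: bool = True) -> torch.Tensor:
    """Keep the n largest-magnitude values in each group of m elements.

    Assumes x.numel() is divisible by m. With rescale=True, surviving
    values are multiplied by m / n to compensate for the pruned mass
    (one plausible reading of the "Rescale Mag." variant below).
    """
    groups = x.reshape(-1, m)                    # view the tensor as groups of m
    idx = groups.abs().topk(n, dim=1).indices    # indices of the top-n magnitudes per group
    mask = torch.zeros_like(groups)
    mask.scatter_(1, idx, 1.0)                   # 1.0 at kept positions, 0.0 elsewhere
    pruned = groups * mask
    if rescale:
        pruned = pruned * (m / n)
    return pruned.reshape(x.shape)
```

For example, with n=2, m=4, the group [0.1, -0.5, 0.3, 0.2] keeps its two largest-magnitude entries and becomes [0.0, -1.0, 0.6, 0.0] after the 2x rescale.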

Implementation Details

Check out sdgp.py for details on how the data gradients are pruned during backpropagation. To make group-level sorting during pruning more efficient, we implemented a custom CUDA kernel. The code has been tested only with CUDA 11.3, PyTorch 1.10.2, and Python 3.9.
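sdgp.py itself is not reproduced here, but the following rough sketch shows one way such pruning can be hooked into backpropagation with a custom autograd function. The class name SDGPFunction is ours, prune_nm is the helper sketched in the Overview, and a real implementation would dispatch to the custom CUDA kernel rather than this pure-PyTorch fallback.

```python
import torch

class SDGPFunction(torch.autograd.Function):
    """Identity in the forward pass; prunes the data gradient in backward."""

    @staticmethod
    def forward(ctx, x, n=2, m=4):
        ctx.n, ctx.m = n, m
        return x

    @staticmethod
    def backward(ctx, grad_output):
        # Replace the dense data gradient with its N:M-pruned version.
        # prune_nm is the group-pruning helper sketched above; a production
        # kernel would fuse the grouping, top-k, and masking steps on the GPU.
        return prune_nm(grad_output, ctx.n, ctx.m), None, None

# Usage: insert between layers inside a model's forward(), e.g.
#   x = SDGPFunction.apply(x, 2, 4)
```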

Training Configuration

Training generally follows the configuration details in the excellent ffcv library. To fit ImageNet in a system with 256 GB of RAM using the ffcv data loader, we decreased the image size and related dataset-writing settings (max resolution, compress probability, JPEG quality) from (500, 0.5, 90), which takes 337 GB, to (448, 0.60, 90), which takes 229 GB. We did not observe any decrease in performance compared to the results posted in the ffcv repository for either ResNet-18 or ResNet-50 using these slightly smaller images.

CIFAR-10

| SDGP Prune Function | Non-zeros | Group size | Top-1 Acc. (%) | Config | Checkpoint |
|---------------------|-----------|------------|----------------|--------|------------|
| None (dense)        | 4         | 4          | 95.3           | link   | link       |
| Random              | 2         | 4          | 94.5           | link   | link       |
| Magnitude           | 2         | 4          | 95.2           | link   | link       |
| Rescale Mag.        | 1         | 4          | 95.1           | link   | link       |
| Rescale Mag.        | 2         | 4          | 95.2           | link   | link       |
| Rescale Mag.        | 1         | 8          | 94.7           | link   | link       |
| Rescale Mag.        | 2         | 8          | 95.1           | link   | link       |
| Rescale Mag.        | 4         | 8          | 95.2           | link   | link       |
| Rescale Mag.        | 2         | 16         | 95.1           | link   | link       |
| Rescale Mag.        | 4         | 16         | 95.2           | link   | link       |
| Rescale Mag.        | 8         | 16         | 95.2           | link   | link       |
| Rescale Mag.        | 4         | 32         | 94.9           | link   | link       |
| Rescale Mag.        | 8         | 32         | 95.3           | link   | link       |
| Rescale Mag.        | 16        | 32         | 95.3           | link   | link       |

ImageNet

| Model         | SDGP Prune Function | Non-zeros | Group size | Top-1 Acc. (%) | Config | Checkpoint |
|---------------|---------------------|-----------|------------|----------------|--------|------------|
| ResNet-18     | None (dense)        | 4         | 4          | 71.4           | link   | link       |
| ResNet-18     | Random              | 2         | 4          | 64.3           | link   | link       |
| ResNet-18     | Magnitude           | 2         | 4          | 72.1           | link   | link       |
| ResNet-18     | Rescale Mag.        | 2         | 4          | 72.4           | link   | link       |
| ResNet-50     | None (dense)        | 4         | 4          | 78.1           | link   | link       |
| ResNet-50     | Random              | 2         | 4          | 70.3           | link   | link       |
| ResNet-50     | Magnitude           | 2         | 4          | 77.7           | link   | link       |
| ResNet-50     | Rescale Mag.        | 2         | 4          | 77.6           | link   | link       |
| RegNetX-400MF | None (dense)        | 4         | 4          | 73.3           | link   | link       |
| RegNetX-400MF | Random              | 2         | 4          | 64.3           | link   | link       |
| RegNetX-400MF | Magnitude           | 2         | 4          | 72.1           | link   | link       |
| RegNetX-400MF | Rescale Mag.        | 2         | 4          | 72.4           | link   | link       |
Owner

Bradley McDanel