Structured Data Gradient Pruning (SDGP)

Weight pruning is a technique to make Deep Neural Network (DNN) inference more computationally efficient by reducing the number of model parameters over the course of training. However, most weight pruning techniques do not speed up DNN training and can even require more iterations to reach model convergence. In this work, we propose a novel Structured Data Gradient Pruning (SDGP) method that can speed up training without impacting model convergence. This approach enforces a specific sparsity structure, where only N out of every M elements in a matrix can be nonzero, making it amenable to hardware acceleration. Modern accelerators such as the NVIDIA A100 GPU support this type of structured sparsity for 2 nonzeros per 4 elements in a reduction. Assuming hardware support for 2:4 sparsity, our approach can achieve a 15-25% reduction in total training time without significant impact on performance.
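As a minimal illustration of the N:M constraint described above, the following framework-free sketch keeps the N largest-magnitude entries in each group of M and zeroes the rest. The function names `prune_n_of_m` and `satisfies_n_of_m` are hypothetical and are not taken from this repository; the actual pruning runs on tensors in a CUDA kernel.

```python
def prune_n_of_m(values, n=2, m=4):
    """Keep the n largest-magnitude entries in each group of m; zero the rest."""
    if len(values) % m != 0:
        raise ValueError("length must be a multiple of the group size m")
    pruned = []
    for start in range(0, len(values), m):
        group = values[start:start + m]
        # Indices of the n entries with the largest absolute value in this group.
        order = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = set(order[:n])
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


def satisfies_n_of_m(values, n=2, m=4):
    """Return True if every group of m has at most n nonzero entries."""
    return all(
        sum(1 for v in values[s:s + m] if v != 0) <= n
        for s in range(0, len(values), m)
    )


row = [0.1, -0.9, 0.4, 0.05, 2.0, -0.3, 0.0, 1.5]
sparse_row = prune_n_of_m(row, n=2, m=4)
print(sparse_row)  # [0.0, -0.9, 0.4, 0.0, 2.0, 0.0, 0.0, 1.5]
```

With `n=2, m=4` this produces the 2:4 pattern mentioned above; other rows of the results tables correspond to other `(n, m)` pairs such as `(4, 8)` or `(8, 16)`.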

Implementation Details

Check out sdgp.py for details on how the data gradients are pruned during backpropagation. Because pruning requires sorting within each group, we implemented our own CUDA kernel to make it efficient. The code has been tested only with CUDA 11.3 and PyTorch 1.10.2 using Python 3.9.
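To show roughly where such pruning sits in the backward pass, here is a small pure-Python sketch for a single linear layer y = W x. It assumes that the gradient arriving from the next layer is pruned to 2:4 along the reduction dimension before it is multiplied by the transposed weights. That placement, and the names `prune_2_of_4` and `linear_backward_data`, are assumptions made for illustration; sdgp.py remains the reference for what the code actually does.

```python
def prune_2_of_4(vec):
    """Magnitude-based 2:4 pruning of a flat list (illustrative helper)."""
    out = []
    for s in range(0, len(vec), 4):
        g = vec[s:s + 4]
        order = sorted(range(len(g)), key=lambda i: abs(g[i]), reverse=True)
        keep = set(order[:2])
        out.extend(v if i in keep else 0.0 for i, v in enumerate(g))
    return out


def linear_backward_data(weight, grad_out, prune=True):
    """Gradient with respect to the input of y = W x, i.e. W^T @ grad_out.

    weight is a list of rows (one per output), grad_out has one entry per output.
    The sum runs over the output index, which is the reduction dimension here.
    """
    g = prune_2_of_4(grad_out) if prune else grad_out
    n_in = len(weight[0])
    return [sum(weight[i][j] * g[i] for i in range(len(g))) for j in range(n_in)]


weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]  # 4 outputs, 2 inputs
grad_out = [0.5, -0.1, 0.2, 1.0]

dense = linear_backward_data(weight, grad_out, prune=False)
sparse = linear_backward_data(weight, grad_out, prune=True)
print(dense)   # approximately [2.7, -0.9]
print(sparse)  # [2.5, -1.0]
```

In the pruned case half of the terms in each sum are zero, which is the structure that 2:4 sparse hardware can skip. Plain Python does not skip them, so the sketch shows only the arithmetic, not the speedup.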

Training Configuration

Training generally follows the configuration details in the excellent ffcv library. To fit ImageNet in a system with 256 GB of RAM using the ffcv data loader, we decreased the image size and other settings from (500, 0.5, 90), which takes 337 GB, to (448, 0.60, 90), which takes 229 GB. We did not observe any decrease in performance compared to the results posted in the ffcv repository on either ResNet-18 or ResNet-50 using these slightly smaller images.

CIFAR-10

| SDGP Prune Function | Nonzeros | Group size | Top-1 Acc. (%) | Config | Checkpoint |
|---|---|---|---|---|---|
| None (dense) | 4 | 4 | 95.3 | link | link |
| Random | 2 | 4 | 94.5 | link | link |
| Magnitude | 2 | 4 | 95.2 | link | link |
| Rescale Mag. | 1 | 4 | 95.1 | link | link |
| Rescale Mag. | 2 | 4 | 95.2 | link | link |
| Rescale Mag. | 1 | 8 | 94.7 | link | link |
| Rescale Mag. | 2 | 8 | 95.1 | link | link |
| Rescale Mag. | 4 | 8 | 95.2 | link | link |
| Rescale Mag. | 2 | 16 | 95.1 | link | link |
| Rescale Mag. | 4 | 16 | 95.2 | link | link |
| Rescale Mag. | 8 | 16 | 95.2 | link | link |
| Rescale Mag. | 4 | 32 | 94.9 | link | link |
| Rescale Mag. | 8 | 32 | 95.3 | link | link |
| Rescale Mag. | 16 | 32 | 95.3 | link | link |

ImageNet

| Model | SDGP Prune Function | Nonzeros | Group size | Top-1 Acc. (%) | Config | Checkpoint |
|---|---|---|---|---|---|---|
| ResNet-18 | None (dense) | 4 | 4 | 71.4 | link | link |
| ResNet-18 | Random | 2 | 4 | 64.3 | link | link |
| ResNet-18 | Magnitude | 2 | 4 | 72.1 | link | link |
| ResNet-18 | Rescale Mag. | 2 | 4 | 72.4 | link | link |
| ResNet-50 | None (dense) | 4 | 4 | 78.1 | link | link |
| ResNet-50 | Random | 2 | 4 | 70.3 | link | link |
| ResNet-50 | Magnitude | 2 | 4 | 77.7 | link | link |
| ResNet-50 | Rescale Mag. | 2 | 4 | 77.6 | link | link |
| RegNetX-400MF | None (dense) | 4 | 4 | 73.3 | link | link |
| RegNetX-400MF | Random | 2 | 4 | 64.3 | link | link |
| RegNetX-400MF | Magnitude | 2 | 4 | 72.1 | link | link |
| RegNetX-400MF | Rescale Mag. | 2 | 4 | 72.4 | link | link |

Owner

Bradley McDanel