The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training

Overview

[ICLR 2022] The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training

The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training
Shiwei Liu, Tianlong Chen, Xiaohan Chen, Li Shen, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy

https://openreview.net/forum?id=VBZJ_3tz-t

Abstract: Random pruning is arguably the most naive way to attain sparsity in neural networks, but has been deemed uncompetitive by either post-training pruning or sparse training. In this paper, we focus on sparse training and highlight a perhaps counter-intuitive finding, that random pruning at initialization (PaI) can be quite powerful for the sparse training of modern neural networks. Without any delicate pruning criteria or carefully pursued sparsity structures, we empirically demonstrate that sparsely training a randomly pruned network from scratch can match the performance of its dense equivalent. There are two key factors that contribute to this revival: (i) the network sizes matter: as the original dense networks grow wider and deeper, the performance of training a randomly pruned sparse network will quickly grow to matching that of its dense equivalent, even at high sparsity ratios; (ii) appropriate layer-wise sparsity ratios can be pre-chosen for sparse training, which shows to be another important performance booster. Simple as it looks, a randomly pruned subnetwork of Wide ResNet-50 can be sparsely trained to match the accuracy of a dense Wide ResNet-50, on ImageNet. We also observed such randomly pruned networks outperform dense counterparts in other favorable aspects, such as out-of-distribution detection, uncertainty estimation, and adversarial robustness. Overall, our results strongly suggest there is larger-than-expected room for sparse training at scale, and the benefits of sparsity might be more universal beyond carefully designed pruning.

This code base is created by Shiwei Liu [email protected] during his Ph.D. at Eindhoven University of Technology.

Requirements

Python 3.6, PyTorch v1.5.1, and CUDA v10.2.

How to Run Experiments

[Training module] The training module is controlled by the following arguments:

  • --sparse - Enable sparse mode (remove this if want to train dense model)
  • --fix - Fix the sparse pattern during training (remove this if want to with dynamic sparse training)
  • --sparse-init - Type of sparse initialization. Choose from: uniform, uniform_plus, ERK, ERK_plus, ER, snip (snip ratio), GraSP (GraSP ratio)
  • --model (str) - cifar_resnet_A_B, where A is the depths and B is the width, e.g., cifar_resnet_20_32
  • --density (float) - density level (default 0.05)

CIFAR-10/100 Experiments

To train ResNet with various depths on CIFAR10/100:

for model in cifar_resnet_20 cifar_resnet_32 cifar_resnet_44 cifar_resnet_56 cifar_resnet_110 
do
    python main.py --sparse --seed 17 --sparse_init ERK --fix --lr 0.1 --density 0.05 --model $model --data cifar10 --epoch 160
done

To train ResNet with various depths on CIFAR10/100:

for model in cifar_resnet_20_8 cifar_resnet_20_16 cifar_resnet_20_24 
do
    python main.py --sparse --seed 17 --sparse_init ERK --fix --lr 0.1 --density 0.05 --model $model --data cifar10 --epoch 160
done

ImageNet Experiments

To train WideResNet50_2 on ImageNet with ERK_plus:

cd ImageNet
python $1multiproc.py --nproc_per_node 4 $1main.py --sparse_init ERK_plus --fc_density 1.0 --fix --fp16 --master_port 5556 -j 10 -p 500 --arch WideResNet50_2 -c fanin --label-smoothing 0.1 -b 192 --lr 0.4 --warmup 5 --epochs 100 --density 0.2 --static-loss-scale 256 $2 ../../../../../../data1/datasets/imagenet2012/ --save save/

Citation

if you find this repo is helpful, please cite

@inproceedings{
liu2022the,
title={The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training},
author={Shiwei Liu and Tianlong Chen and Xiaohan Chen and Li Shen and Decebal Constantin Mocanu and Zhangyang Wang and Mykola Pechenizkiy},
booktitle={International Conference on Learning Representations},
year={2022},
url={https://openreview.net/forum?id=VBZJ_3tz-t}
}
Owner
VITA
Visual Informatics Group @ University of Texas at Austin
VITA
A Number Recognition algorithm

Paddle-VisualAttention Results_Compared SVHN Dataset Methods Steps GPU Batch Size Learning Rate Patience Decay Step Decay Rate Training Speed (FPS) Ac

1 Nov 12, 2021
In real-world applications of machine learning, reliable and safe systems must consider measures of performance beyond standard test set accuracy

PixMix Introduction In real-world applications of machine learning, reliable and safe systems must consider measures of performance beyond standard te

Andy Zou 79 Dec 30, 2022
Implicit MLE: Backpropagating Through Discrete Exponential Family Distributions

torch-imle Concise and self-contained PyTorch library implementing the I-MLE gradient estimator proposed in our NeurIPS 2021 paper Implicit MLE: Backp

UCL Natural Language Processing 249 Jan 03, 2023
Learning Synthetic Environments and Reward Networks for Reinforcement Learning

Learning Synthetic Environments and Reward Networks for Reinforcement Learning We explore meta-learning agent-agnostic neural Synthetic Environments (

AutoML-Freiburg-Hannover 16 Sep 02, 2022
Remote sensing change detection using PaddlePaddle

Change Detection Laboratory Developing and benchmarking deep learning-based remo

Lin Manhui 15 Sep 23, 2022
The comma.ai Calibration Challenge!

Welcome to the comma.ai Calibration Challenge! Your goal is to predict the direction of travel (in camera frame) from provided dashcam video. This rep

comma.ai 697 Jan 05, 2023
Pytorch implementation of the DeepDream computer vision algorithm

deep-dream-in-pytorch Pytorch (https://github.com/pytorch/pytorch) implementation of the deep dream (https://en.wikipedia.org/wiki/DeepDream) computer

102 Dec 05, 2022
TimeSHAP explains Recurrent Neural Network predictions.

TimeSHAP TimeSHAP is a model-agnostic, recurrent explainer that builds upon KernelSHAP and extends it to the sequential domain. TimeSHAP computes even

Feedzai 90 Dec 18, 2022
A benchmark for the task of translation suggestion

WeTS: A Benchmark for Translation Suggestion Translation Suggestion (TS), which provides alternatives for specific words or phrases given the entire d

zhyang 55 Dec 24, 2022
Dilated Convolution with Learnable Spacings PyTorch

Dilated-Convolution-with-Learnable-Spacings-PyTorch Ismail Khalfaoui Hassani Dilated Convolution with Learnable Spacings (abbreviated to DCLS) is a no

15 Dec 09, 2022
Modified prey-predator system - Modified prey–predator model describes the rate of change for each species by adding coupling terms.

Modified prey-predator system We aim to study the behaviors of the modified prey–predator model and establish the effects of several parameters that p

Seoyoung Oh 1 Jan 02, 2022
code for Multi-scale Matching Networks for Semantic Correspondence, ICCV

MMNet This repo is the official implementation of ICCV 2021 paper "Multi-scale Matching Networks for Semantic Correspondence.". Pre-requisite conda cr

joey zhao 25 Dec 12, 2022
A PyTorch re-implementation of the paper 'Exploring Simple Siamese Representation Learning'. Reproduced the 67.8% Top1 Acc on ImageNet.

Exploring simple siamese representation learning This is a PyTorch re-implementation of the SimSiam paper on ImageNet dataset. The results match that

Taojiannan Yang 72 Nov 09, 2022
Graph Convolutional Neural Networks with Data-driven Graph Filter (GCNN-DDGF)

Graph Convolutional Gated Recurrent Neural Network (GCGRNN) Improved from Graph Convolutional Neural Networks with Data-driven Graph Filter (GCNN-DDGF

Lei Lin 21 Dec 18, 2022
A Weakly Supervised Amodal Segmenter with Boundary Uncertainty Estimation

Paper Khoi Nguyen, Sinisa Todorovic "A Weakly Supervised Amodal Segmenter with Boundary Uncertainty Estimation", accepted to ICCV 2021 Our code is mai

Khoi Nguyen 5 Aug 14, 2022
Video Representation Learning by Recognizing Temporal Transformations. In ECCV, 2020.

Video Representation Learning by Recognizing Temporal Transformations [Project Page] Simon Jenni, Givi Meishvili, and Paolo Favaro. In ECCV, 2020. Thi

Simon Jenni 46 Nov 14, 2022
An Artificial Intelligence trying to drive a car by itself on a user created map

An Artificial Intelligence trying to drive a car by itself on a user created map

Akhil Sahukaru 17 Jan 13, 2022
toroidal - a lightweight transformer library for PyTorch

toroidal - a lightweight transformer library for PyTorch Toroidal transformers are of smaller size and lower weight than the more common E-I types. Th

MathInf GmbH 64 Jan 07, 2023
Official PyTorch(Geometric) implementation of DPGNN(DPGCN) in "Distance-wise Prototypical Graph Neural Network for Node Imbalance Classification"

DPGNN This repository is an official PyTorch(Geometric) implementation of DPGNN(DPGCN) in "Distance-wise Prototypical Graph Neural Network for Node Im

Yu Wang (Jack) 18 Oct 12, 2022
RCD: Relation Map Driven Cognitive Diagnosis for Intelligent Education Systems

RCD: Relation Map Driven Cognitive Diagnosis for Intelligent Education Systems This is our implementation for the paper: Weibo Gao, Qi Liu*, Zhenya Hu

BigData Lab @USTC 中科大大数据实验室 10 Oct 16, 2022