Pytorch implementation of our paper under review -- 1xN Pattern for Pruning Convolutional Neural Networks

Last update: Nov 29, 2022

Related tags

Overview

1xN Pattern for Pruning Convolutional Neural Networks (paper) .

This is Pytorch re-implementation of "1xN Pattern for Pruning Convolutional Neural Networks". A more formal project will be released as soon as we are given the authority from Alibaba Group.

1) 1×N Block Pruning

Requirements

Python 3.7
Pytorch >= 1.0.1
CUDA = 10.0.0

Code Running

To reproduce our experiments, please use the following command:

python imagenet.py \
--gpus 0 \
--arch mobilenet_v1 (or mobilenet_v2 or mobilenet_v3_large or mobilenet_v3_small) \
--job_dir ./experiment/ \
--data_path [DATA_PATH] \
--pretrained_model [PRETRAIN_MODEL_PATH] \
--pr_target 0.5 \
--N 4 (or 2, 8, 16, 32) \
--conv_type BlockL1Conv \
--train_batch_size 256 \
--eval_batch_size 256 \
--rearrange \

Accuracy Performance

Table 1: Performance comparison of our 1×N block sparsity against weight pruning and filter pruning (p = 50%).

MobileNet-V1	Top-1 Acc.	Top-5 Acc.	Model Link
Weight Pruning	70.764	89.592	Pruned Model
Filter Pruning	65.348	86.264	Pruned Model
1 x 2 Block	70.281	89.370	Pruned Model
1 x 4 Block	70.052	89.056	Pruned Model
1 x 8 Block	69.908	89.027	Pruned Model
1 x 16 Block	69.559	88.933	Pruned Model
1 x 32 Block	69.541	88.801	Pruned Model

MobileNet-V2	Top-1 Acc.	Top-5 Acc.	Model Link
Weight Pruning	71.146	89.872	Pruned Model
Filter Pruning	66.730	87.190	Pruned Model
1 x 2 Block	70.233	89.417	Pruned Model
1 x 4 Block	60.706	89.165	Pruned Model
1 x 8 Block	69.372	88.862	Pruned Model
1 x 16 Block	69.352	88.708	Pruned Model
1 x 32 Block	68.762	88.425	Pruned Model

MobileNet-V3-small	Top-1 Acc.	Top-5 Acc.	Model Link
Weight Pruning	66.376	86.868	Pruned Model
Filter Pruning	59.054	81.713	Pruned Model
1 x 2 Block	65.380	86.060	Pruned Model
1 x 4 Block	64.465	85.495	Pruned Model
1 x 8 Block	64.101	85.274	Pruned Model
1 x 16 Block	63.126	84.203	Pruned Model
1 x 32 Block	62.881	83.982	Pruned Model

MobileNet-V3-large	Top-1 Acc.	Top-5 Acc.	Model Link
Weight Pruning	72.897	91.093	Pruned Model
Filter Pruning	69.137	89.097	Pruned Model
1 x 2 Block	72.120	90.677	Pruned Model
1 x 4 Block	71.935	90.458	Pruned Model
1 x 8 Block	71.478	90.163	Pruned Model
1 x 16 Block	71.112	90.129	Pruned Model
1 x 32 Block	70.769	89.696	Pruned Model

More links for pruned models under different pruning rates and their training logs can be found in MobileNet-V2 and ResNet-50.

Evaluate our models

To verify the performance of our pruned models, download our pruned models from the links provided above and run the following command:

python imagenet.py \
--gpus 0 \
--arch mobilenet_v1 (or mobilenet_v2 or mobilenet_v3_large or mobilenet_v3_small) \
--data_path [DATA_PATH] \
--conv_type DenseConv \
--evaluate [PRUNED_MODEL_PATH] \
--eval_batch_size 256 \

Arguments

optional arguments:
  -h, --help            show this help message and exit
  --gpus                Select gpu_id to use. default:[0]
  --data_path           The dictionary where the data is stored.
  --job_dir             The directory where the summaries will be stored.
  --resume              Load the model from the specified checkpoint.
  --pretrain_model      Path of the pre-trained model.
  --pruned_model        Path of the pruned model to evaluate.
  --arch                Architecture of model. For ImageNet :mobilenet_v1, mobilenet_v2, mobilenet_v3_small, mobilenet_v3_large
  --num_epochs          The num of epochs to train. default:180
  --train_batch_size    Batch size for training. default:256
  --eval_batch_size     Batch size for validation. default:100
  --momentum            Momentum for Momentum Optimizer. default:0.9
  --lr LR               Learning rate. default:1e-2
  --lr_decay_step       The iterval of learn rate decay for cifar. default:100 150
  --lr_decay_freq       The frequecy of learn rate decay for Imagenet. default:30
  --weight_decay        The weight decay of loss. default:4e-5
  --lr_type             lr scheduler. default: cos. optional:exp/cos/step/fixed
  --use_dali            If this parameter exists, use dali module to load ImageNet data (benefit in training acceleration).
  --conv_type           Importance criterion of filters. Default: BlockL1Conv. optional: BlockRandomConv, DenseConv
  --pr_target           Pruning rate. default:0.5
  --full                If this parameter exists, prune fully-connected layer.
  --N                   Consecutive N kernels for removal (see paper for details).
  --rearrange           If this parameter exists, filters will be rearranged (see paper for details).
  --export_onnx         If this parameter exists, export onnx model.

2）Filter Rearrangement

Table 2: Performance studies of our 1×N block sparsity with and without filter rearrangement (p=50%).

N = 2	Top-1 Acc.	Top-5 Acc.	Model Link
w/o Rearange	69.900	89.296	Pruned Model
Rearrange	70.233	89.417	Pruned Model

N = 4	Top-1 Acc.	Top-5 Acc.	Model Link
w/o Rearange	69.521	88.920	Pruned Model
Rearrange	69.579	88.944	Pruned Model

N = 8	Top-1 Acc.	Top-5 Acc.	Model Link
w/o Rearange	69.206	88.608	Pruned Model
Rearrange	69.372	88.862	Pruned Model

N = 16	Top-1 Acc.	Top-5 Acc.	Model Link
w/o Rearange	68.971	88.399	Pruned Model
Rearrange	69.352	88.708	Pruned Model

N = 32	Top-1 Acc.	Top-5 Acc.	Model Link
w/o Rearange	68.431	88.315	Pruned Model
Rearrange	68.762	88.425	Pruned Model

3）Encoding and Decoding Efficiency

Performance and latency comparison

Our sparse convolution implementation has been released to TVM community.

To verify the performance of our pruned models, convert onnx model and run the following command:

python model_tune.py \
--onnx_path [ONNX_MODEL_PATH] \
--bsr 4 \
--bsc 1 \
--sparsity 0.5

The detail tuning setting is referred to TVM.

4）Contact

Any problem regarding this code re-implementation, please contact the first author: [email protected] or the third author: [email protected].

Any problem regarding the sparse convolution implementation, please contact the second author: [email protected].

Pytorch implementation of our paper under review -- 1xN Pattern for Pruning Convolutional Neural Networks

Related tags

Overview

1xN Pattern for Pruning Convolutional Neural Networks (paper) .

1) 1×N Block Pruning

Requirements

Code Running

Accuracy Performance

Evaluate our models

Arguments

2）Filter Rearrangement

3）Encoding and Decoding Efficiency

Performance and latency comparison

4）Contact

Owner

Mingbao Lin (林明宝)

Code and project page for ICCV 2021 paper "DisUnknown: Distilling Unknown Factors for Disentanglement Learning"

Tutoriais publicados nas nossas redes sociais para obtenção de dados, análises simples e outras tarefas relevantes no mercado financeiro.

classify fashion-mnist dataset with pytorch

Code for Environment Inference for Invariant Learning (ICML 2020 UDL Workshop Paper)

FasterAI: A library to make smaller and faster models with FastAI.

Generalized Data Weighting via Class-level Gradient Manipulation

Code of the lileonardo team for the 2021 Emotion and Theme Recognition in Music task of MediaEval 2021

Pyramid Grafting Network for One-Stage High Resolution Saliency Detection. CVPR 2022

RefineMask (CVPR 2021)

CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images

Code repository for our paper regarding the L3D dataset.

Code for the paper BERT might be Overkill: A Tiny but Effective Biomedical Entity Linker based on Residual Convolutional Neural Networks

Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning

Certified Patch Robustness via Smoothed Vision Transformers

Official repository of the AAAI'2022 paper "Contrast and Generation Make BART a Good Dialogue Emotion Recognizer"

Tool for working with Y-chromosome data from YFull and FTDNA

Official PyTorch implementation of "Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning" (AAAI 2021)

This repository contains the official implementation code of the paper Transformer-based Feature Reconstruction Network for Robust Multimodal Sentiment Analysis

UV matrix decompostion using movielens dataset

Invariant Causal Prediction for Block MDPs