PyTorch implementation of BRECQ, ICLR 2021

Related tags

Deep Learning, BRECQ
Overview

BRECQ

PyTorch implementation of BRECQ, ICLR 2021

@inproceedings{
li2021brecq,
title={BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction},
author={Yuhang Li and Ruihao Gong and Xu Tan and Yang Yang and Peng Hu and Qi Zhang and Fengwei Yu and Wei Wang and Shi Gu},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=POWv6hDd9XH}
}

Pretrained models

We provide all the pretrained models, and they can be accessed via torch.hub.

For example: use res18 = torch.hub.load('yhhhli/BRECQ', model='resnet18', pretrained=True) to get the pretrained ResNet-18 model.

If you encounter a URLError when downloading the pretrained network, it is probably a network failure. An alternative is to use wget to download the file manually, then move it to ~/.cache/torch/checkpoints, where the load_state_dict_from_url function checks before downloading.

For example:

wget https://github.com/yhhhli/BRECQ/releases/download/v1.0/resnet50_imagenet.pth.tar 
mv resnet50_imagenet.pth.tar ~/.cache/torch/checkpoints
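Once the file is in place, the usual hub call should pick it up from the cache instead of re-downloading (a minimal sketch; the resnet50 entry point matches the checkpoint above):

import torch

# The checkpoint placed in ~/.cache/torch/checkpoints is found locally,
# so no download is attempted.
res50 = torch.hub.load('yhhhli/BRECQ', model='resnet50', pretrained=True)
res50.eval()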

Usage

python main_imagenet.py --data_path PATH/TO/DATA --arch resnet18 --n_bits_w 2 --channel_wise --n_bits_a 4 --act_quant --test_before_calibration

You can get the following output:

Quantized accuracy before brecq: 0.13599999248981476
Weight quantization accuracy: 66.32799530029297
Full quantization (W2A4) accuracy: 65.21199798583984
Comments
  • How to reproduce the zero-data result?

    As the title asks.

    there is a bug: https://github.com/yhhhli/BRECQ/blob/da93abc4f7e3ef437b356a2df8a5ecd8c326556e/main_imagenet.py#L173

    args.batchsize should be args.workers
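    For reference, a minimal sketch of what the suspected fix looks like (the argument names here are hypothetical, not the repo's actual code at the linked line):

    train_loader, test_loader = build_imagenet_data(
        batch_size=args.batch_size,
        workers=args.workers,  # reportedly passed as args.batchsize by mistake
        data_path=args.data_path)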

    opened by yyfcc17 6
  • Why not quantize the activation of the last conv layer in a block?

    Hi, thanks for releasing your code. I have one question about an implementation detail. In quant_block.py, take the following ResNet-18/ResNet-34 code as an example: disable_act_quant is set to True for conv2, which disables quantization of conv2's output.

    class QuantBasicBlock(BaseQuantBlock):
        """
        Implementation of Quantized BasicBlock used in ResNet-18 and ResNet-34.
        """
        def __init__(self, basic_block: BasicBlock, weight_quant_params: dict = {}, act_quant_params: dict = {}):
            super().__init__(act_quant_params)
            self.conv1 = QuantModule(basic_block.conv1, weight_quant_params, act_quant_params)
            self.conv1.activation_function = basic_block.relu1
            self.conv2 = QuantModule(basic_block.conv2, weight_quant_params, act_quant_params, disable_act_quant=True)
    
            # modify the activation function to ReLU
            self.activation_function = basic_block.relu2
    
            if basic_block.downsample is None:
                self.downsample = None
            else:
                self.downsample = QuantModule(basic_block.downsample[0], weight_quant_params, act_quant_params,
                                              disable_act_quant=True)
            # copying all attributes in original block
            self.stride = basic_block.stride
    

    This causes a boost in accuracy. Below are the results I get using your code and the same ImageNet dataset as in the paper; [1] and [2] denote the modifications I made to the original code.

    (accuracy comparison table attached as an image in the original issue)

    [1]: quant_block.py → QuantBasicBlock.__init__: in self.conv2 = QuantModule(..., disable_act_quant=True) and self.downsample = QuantModule(basic_block.downsample[0], weight_quant_params, act_quant_params, disable_act_quant=True), change True to False.
    [2]: quant_block.py → QuantInvertedResidual.__init__: in self.conv = nn.Sequential(..., QuantModule(..., disable_act_quant=True)), change True to False. (See the sketch below.)
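    As a sketch, modification [1] amounts to flipping the flag in the constructor shown above:

    # re-enable output quantization on the block's last conv
    self.conv2 = QuantModule(basic_block.conv2, weight_quant_params, act_quant_params,
                             disable_act_quant=False)  # was True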

    But I do not think this is applicable to most NPUs, which quantize the output of every conv layer. So why not quantize the activation of the last conv layer in a block? Is there a particular reason for this? Also, for the methods you compared against in the paper, have you checked whether they do the same thing?

    opened by frankgt 3
  • disable act quantization is designed for convolution

    Hi, very impressive code.

    There is a question about the quantization of activation values.

    In the code:

    disable act quantization is designed for convolution before elemental-wise operation,

    in that case, we apply activation function and quantization after ele-wise op.
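    For reference, a simplified sketch of the block's forward pass as that comment describes it (attribute names assumed from the repo's BaseQuantBlock, not verbatim code):

    def forward(self, x):
        residual = x if self.downsample is None else self.downsample(x)
        out = self.conv2(self.conv1(x))    # conv2 output left unquantized
        out += residual                    # element-wise add
        out = self.activation_function(out)
        if self.use_act_quant:             # quantize once, after the add + ReLU
            out = self.act_quantizer(out)
        return out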

    Why can the quantization be moved like this?

    Thanks

    opened by xiayizhan2017 2
  • How to deal with data parallel and distributed data parallel?

    As far as I can see, your code runs on a single GPU, while I need to test it on multiple GPUs for other implementations. I just want to check whether you have run your code with DataParallel and DistributedDataParallel.
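    For what it's worth, a minimal sketch of multi-GPU wrapping with stock PyTorch (whether the repo's quantization wrappers tolerate this is exactly the open question):

    import torch
    import torch.nn as nn

    cnn = torch.hub.load('yhhhli/BRECQ', model='resnet18', pretrained=True)
    cnn = nn.DataParallel(cnn).cuda()  # replicates the model across visible GPUs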

    opened by jang0977 2
  • What is the purpose for setting retain_graph=True?

    https://github.com/yhhhli/BRECQ/blob/2888b29de0a88ece561ae2443defc758444e41c1/quant/block_recon.py#L91

    What is the purpose for setting retain_graph=True?
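    For context, retain_graph=True is needed whenever autograd traverses the same graph more than once; a standalone illustration:

    import torch

    x = torch.randn(4, requires_grad=True)
    y = (x ** 2).sum()
    y.backward(retain_graph=True)  # keep intermediate buffers alive
    y.backward()                   # second pass would raise without retain_graph above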

    opened by un-knight 2
  • Cannot reproduce the accuracy

    Greetings,

    Really appreciate your open source contribution.

    However, it seems the accuracy reported in the paper cannot be reproduced with the standard ImageNet pipeline. For instance, on the full-precision models I measured ResNet-18 at 70.186% and MobileNetV2 at 71.618%, slightly lower than the paper's results (71.08% and 72.49%, respectively).

    Have you utilized any preprocessing techniques other than imagenet.build_imagenet_data?
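    For comparison, the standard torchvision evaluation pipeline looks like this; whether build_imagenet_data matches it exactly is the open question here:

    import torchvision.transforms as T

    eval_transform = T.Compose([
        T.Resize(256),
        T.CenterCrop(224),
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406],
                    std=[0.229, 0.224, 0.225]),
    ])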

    Thanks

    opened by mike-zyz 2
  • Suggest replacing .view with .reshape in the accuracy() function

    Got an error:

    Traceback (most recent call last):
      File "main_imagenet.py", line 198, in <module>
        print('Quantized accuracy before brecq: {}'.format(validate_model(test_loader, qnn)))
      File "/home/xxxx/anaconda3/envs/torch/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "main_imagenet.py", line 108, in validate_model
        acc1, acc5 = accuracy(output, target, topk=(1, 5))
      File "main_imagenet.py", line 77, in accuracy
        correct_k = correct[:k].view(-1).float().sum(0, keepdim=True)
    RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
    

    So I suggest replacing .view with .reshape in the accuracy() function.
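    A sketch of the fix, based on the standard PyTorch reference implementation of accuracy():

    import torch

    def accuracy(output, target, topk=(1,)):
        """Top-k accuracy; .reshape tolerates the non-contiguous slice
        that made .view raise in the traceback above."""
        with torch.no_grad():
            maxk = max(topk)
            batch_size = target.size(0)
            _, pred = output.topk(maxk, 1, True, True)
            pred = pred.t()
            correct = pred.eq(target.view(1, -1).expand_as(pred))
            res = []
            for k in topk:
                correct_k = correct[:k].reshape(-1).float().sum(0, keepdim=True)
                res.append(correct_k.mul_(100.0 / batch_size))
            return res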

    opened by un-knight 1
  • channel_wise quantization

    Hi, nice idea for quantization. But it seems the paper (excluding the appendix) does not state that channel-wise quantization is used, while the code shows it is. As we know, channel-wise quantization naturally outperforms layer-wise quantization, so it may be hard to claim that the method's performance is close to QAT.
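    To make the distinction concrete, a toy sketch of the two scale choices (not the repo's actual quantizer):

    import torch

    def weight_scale(w: torch.Tensor, n_bits: int, channel_wise: bool) -> torch.Tensor:
        qmax = 2 ** (n_bits - 1) - 1
        if channel_wise:
            # one scale per output channel: a much finer grid per filter
            return w.abs().flatten(1).max(dim=1).values / qmax
        # one scale shared by the whole weight tensor
        return w.abs().max() / qmax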

    opened by shiyuetianqiang 1
  • Some questions about implementation details

    Hello, thank you for an interesting paper and nice code.

    I have two questions concerning implementation details.

    1. Does the "one-by-one" block reconstruction mentioned in the paper mean that the input to each block comes from already-quantized preceding blocks, i.e. each block may correct quantization errors coming from previous blocks? Or is the input to each block collected from the full-precision model?
    2. Am I correct that in the block-wise reconstruction objective you use the gradient of each sample in the calibration set independently (i.e. no gradient averaging or the like, as with the Adam update mentioned in the paper)? Also, what is happening here in data_utils.py; why do you add 1.0 to the gradients? (See the sketch after this snippet.)
    cached_grads = cached_grads.abs() + 1.0
    # scaling to make sure its mean is 1
    # cached_grads = cached_grads * torch.sqrt(cached_grads.numel() / cached_grads.pow(2).sum())
    
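    A guess at how these cached gradients are used downstream (a sketch of the Fisher-diagonal weighting, not verbatim repo code); adding 1.0 keeps every position's weight strictly positive:

    def fisher_diag_loss(pred, tgt, grad):
        # squared pre-activation gradients weight the squared reconstruction error
        return ((pred - tgt).pow(2) * grad.pow(2)).sum(1).mean()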

    Thank you for your time and consideration!

    opened by AndreevP 0
  • Quantization doesn't work?

    Quantization doesn't work?

    Hi,

    So I tried running your code on CIFAR-10 with a pre-trained ResNet-50 model (code attached below). However, my accuracy comes nowhere near the float model's, which is around 93%; after quantization I get:

    • Accuracy of the network on the 10000 test images: 10.0 % top5: 52.28 %

    Please help me with this. The code is inside the zip file.

    main_cifar.zip

    opened by praneet195 0
  • Fisher-diag Hessian estimation from the paper raises "Trying to backward through the graph a second time"

    As proposed in the paper, estimating the Hessian with the Fisher-diag approach requires the gradient of each layer's pre-activation. When actually running the code, however, cur_grad = get_grad(cali_data[i * batch_size:(i + 1) * batch_size]) in save_grad_data raises "Trying to backward through the graph a second time" on the second batch; the first batch runs without error. Has the author encountered a similar situation?

    opened by ariescts 2
  • CUDA error when launching the example

    [email protected]:/path_to/BRECQ# python main_imagenet.py --data_path /path_to/IMAGENET_2012/ --arch resnet18 --n_bits_w 2 --channel_wise --n_bits_a 4 --act_quant --test_before_calibration
    You are using fake SyncBatchNorm2d who is actually the official BatchNorm2d
    ==> Using Pytorch Dataset
    Downloading: "https://github.com/yhhhli/BRECQ/releases/download/v1.0/resnet18_imagenet.pth.tar" to /root/.cache/torch/hub/checkpoints/resnet18_imagenet.pth.tar
    100%|██████████| 44.6M/44.6M [00:27<00:00, 1.70MB/s]
    Traceback (most recent call last):
      File "main_imagenet.py", line 178, in <module>
        cnn.cuda()
      File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 680, in cuda
        return self._apply(lambda t: t.cuda(device))
      File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 570, in _apply
        module._apply(fn)
      File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 593, in _apply
        param_applied = fn(param)
      File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 680, in <lambda>
        return self._apply(lambda t: t.cuda(device))
    RuntimeError: CUDA error: out of memory
    CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

    opened by L-ED 1
Owner
Yuhang Li
Research Intern at @SenseTime Group Limited