PyTorch implementation of BRECQ, ICLR 2021

BRECQ

@inproceedings{
li&gong2021brecq,
title={BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction},
author={Yuhang Li and Ruihao Gong and Xu Tan and Yang Yang and Peng Hu and Qi Zhang and Fengwei Yu and Wei Wang and Shi Gu},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=POWv6hDd9XH}
}

Pretrained models

All pretrained models are provided and can be loaded via torch.hub.

For example: use res18 = torch.hub.load('yhhhli/BRECQ', model='resnet18', pretrained=True) to get the pretrained ResNet-18 model.

If you encounter a URLError when downloading a pretrained network, it is probably a network failure. As an alternative, download the file manually with wget and move it to ~/.cache/torch/checkpoints, which the load_state_dict_from_url function checks before downloading.

For example:

wget https://github.com/yhhhli/BRECQ/releases/download/v1.0/resnet50_imagenet.pth.tar 
mv resnet50_imagenet.pth.tar ~/.cache/torch/checkpoints
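
The manual download works because the loader looks for a file named after the last component of the URL inside the cache directory. A minimal sketch of that lookup, using only the standard library, might look as follows. The helper name `find_cached_checkpoint` is hypothetical and not part of BRECQ or PyTorch; the cache directory is left as a parameter because some PyTorch versions use a different location (for example `~/.cache/torch/hub/checkpoints`).

```python
import os
from urllib.parse import urlparse


def find_cached_checkpoint(url, cache_dir="~/.cache/torch/checkpoints"):
    """Return the local path a checkpoint URL maps to, and whether it exists.

    Hypothetical helper for illustration: it mirrors the "check the cache
    before downloading" behaviour described above.
    """
    # The cached file is named after the last path component of the URL.
    filename = os.path.basename(urlparse(url).path)
    path = os.path.join(os.path.expanduser(cache_dir), filename)
    return path, os.path.isfile(path)


url = "https://github.com/yhhhli/BRECQ/releases/download/v1.0/resnet50_imagenet.pth.tar"
path, cached = find_cached_checkpoint(url)
print(path, "(already cached)" if cached else "(not cached yet)")
```

If the file is reported as not cached after the `mv` step, the cache directory used by the installed PyTorch version is probably different from the one passed in.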

Usage

python main_imagenet.py --data_path PATH/TO/DATA --arch resnet18 --n_bits_w 2 --channel_wise --n_bits_a 4 --act_quant --test_before_calibration

You can get the following output:

Quantized accuracy before brecq: 0.13599999248981476
Weight quantization accuracy: 66.32799530029297
Full quantization (W2A4) accuracy: 65.21199798583984
Comments
  • How to reproduce the zero-data result?

    As the title says: how can the zero-data result be reproduced?

    There is also a bug: https://github.com/yhhhli/BRECQ/blob/da93abc4f7e3ef437b356a2df8a5ecd8c326556e/main_imagenet.py#L173

    args.batchsize should be args.workers

    opened by yyfcc17 6
  • Why not quantize the activation of the last conv layer in a block?

    Hi, thanks for releasing your code. I have a question about an implementation detail. In quant_block.py, take the following code for ResNet-18 and ResNet-34 as an example: disable_act_quant is set to True for conv2, which disables quantization of the output of conv2.

    class QuantBasicBlock(BaseQuantBlock):
        """
        Implementation of Quantized BasicBlock used in ResNet-18 and ResNet-34.
        """
        def __init__(self, basic_block: BasicBlock, weight_quant_params: dict = {}, act_quant_params: dict = {}):
            super().__init__(act_quant_params)
            self.conv1 = QuantModule(basic_block.conv1, weight_quant_params, act_quant_params)
            self.conv1.activation_function = basic_block.relu1
            self.conv2 = QuantModule(basic_block.conv2, weight_quant_params, act_quant_params, disable_act_quant=True)
    
            # modify the activation function to ReLU
            self.activation_function = basic_block.relu2
    
            if basic_block.downsample is None:
                self.downsample = None
            else:
                self.downsample = QuantModule(basic_block.downsample[0], weight_quant_params, act_quant_params,
                                              disable_act_quant=True)
            # copying all attributes in original block
            self.stride = basic_block.stride
    

    This boosts accuracy. The following are the results I get using your code and the same ImageNet dataset used in the paper; [1] and [2] denote the modifications I made to the original code.

    [image]

    [1]: quant_block.py → QuantBasicBlock → __init__: in self.conv2 = QuantModule(..., disable_act_quant=True) and self.downsample = QuantModule(basic_block.downsample[0], weight_quant_params, act_quant_params, disable_act_quant=True), change True to False.

    [2]: quant_block.py → QuantInvertedResidual → __init__: in self.conv = nn.Sequential(..., QuantModule(..., disable_act_quant=True)), change True to False.

    However, I do not think this setting is applicable to most NPUs, which quantize the output of every conv layer. So why not quantize the activation of the last conv layer in a block? Is there a particular reason for this? Also, for the methods you compare against in your paper, have you checked whether they do the same thing?

    opened by frankgt 3
  • Disable act quantization is designed for convolution

    Hi, very impressive code.

    I have a question about the quantization of activation values.

    A comment in the code says:

    "disable act quantization is designed for convolution before elemental-wise operation, in that case, we apply activation function and quantization after ele-wise op."

    Why can the quantization be moved after the element-wise operation like this?

    Thanks

    opened by xiayizhan2017 2
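
The ordering described in the quoted comment can be sketched on plain Python lists. This is only an illustration of the data flow (raw conv output, then element-wise add, then ReLU, then a single activation quantization); `fake_quantize_act` and `residual_tail` are hypothetical names, and the fixed clipping range is an assumption, not BRECQ's learned quantizer.

```python
def fake_quantize_act(x, n_bits=4, x_max=4.0):
    """Uniform unsigned fake quantization of one non-negative activation."""
    levels = 2 ** n_bits - 1
    step = x_max / levels
    clipped = min(max(x, 0.0), x_max)
    return round(clipped / step) * step


def relu(x):
    return max(x, 0.0)


def residual_tail(conv2_out, shortcut, n_bits=4):
    """Tail of a residual block with activation quantization disabled on conv2.

    The conv2 output enters the element-wise add unquantized; the activation
    function and the quantizer are applied once, after the add.
    """
    return [fake_quantize_act(relu(a + b), n_bits)
            for a, b in zip(conv2_out, shortcut)]


conv2_out = [-1.3, 0.4, 2.1]   # raw (unquantized) output of the last conv
shortcut = [2.0, 0.1, 3.5]     # identity / downsample branch
print(residual_tail(conv2_out, shortcut))
```

With this ordering the tensor that leaves the block is still quantized, but the intermediate conv2 output is not, which is the behaviour the questions in these threads are about.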
  • How to deal with data parallel and distributed data parallel?

    As far as I can see, your code runs on a single GPU only, while I need to test it with multiple GPUs for other implementations. Have you run your code with DataParallel or DistributedDataParallel?

    opened by jang0977 2
  • What is the purpose of setting retain_graph=True?

    https://github.com/yhhhli/BRECQ/blob/2888b29de0a88ece561ae2443defc758444e41c1/quant/block_recon.py#L91

    What is the purpose of setting retain_graph=True on this line?

    opened by un-knight 2
  • Cannot reproduce the accuracy

    Greetings,

    I really appreciate your open-source contribution.

    However, the accuracy reported in the paper does not seem to be reproducible on the standard ImageNet. For instance, for the full-precision models I measured ResNet-18 (70.186%) and MobileNetV2 (71.618%), which are slightly lower than the results in your paper (71.08 and 72.49, respectively).

    Have you used any preprocessing techniques other than imagenet.build_imagenet_data?

    Thanks

    opened by mike-zyz 2
  • Suggest replacing .view with .reshape in the accuracy() function

    I got an error:

    Traceback (most recent call last):
      File "main_imagenet.py", line 198, in <module>
        print('Quantized accuracy before brecq: {}'.format(validate_model(test_loader, qnn)))
      File "/home/xxxx/anaconda3/envs/torch/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "main_imagenet.py", line 108, in validate_model
        acc1, acc5 = accuracy(output, target, topk=(1, 5))
      File "main_imagenet.py", line 77, in accuracy
        correct_k = correct[:k].view(-1).float().sum(0, keepdim=True)
    RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
    

    So I suggest replacing .view with .reshape in the accuracy() function.

    opened by un-knight 1
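
The suggested change can be sketched as follows. This is a reconstruction of a typical top-k `accuracy()` helper, not a copy of `main_imagenet.py`; only the `.reshape(-1)` line reflects the fix proposed above. The short demo at the end reproduces the underlying behaviour on a transposed tensor, where `.view(-1)` raises and `.reshape(-1)` copies instead.

```python
import torch


def accuracy(output, target, topk=(1,)):
    """Top-k accuracy in percent; uses .reshape so non-contiguous slices work."""
    maxk = max(topk)
    batch_size = target.size(0)
    _, pred = output.topk(maxk, dim=1, largest=True, sorted=True)
    pred = pred.t()  # shape (maxk, batch); transposing makes it non-contiguous
    correct = pred.eq(target.view(1, -1).expand_as(pred))
    res = []
    for k in topk:
        # .reshape falls back to a copy when the slice is not contiguous,
        # whereas .view may raise the RuntimeError shown in the traceback.
        correct_k = correct[:k].reshape(-1).float().sum(0, keepdim=True)
        res.append(correct_k.mul_(100.0 / batch_size))
    return res


# Minimal reproduction of the view/reshape difference on a transposed tensor.
x = torch.arange(6).reshape(2, 3).t()
try:
    x.view(-1)
    view_failed = False
except RuntimeError:
    view_failed = True
print("view failed:", view_failed, "| reshape:", x.reshape(-1).tolist())
```

Whether the original `.view` call fails appears to depend on the PyTorch version, since it hinges on the memory layout of the comparison result; `.reshape` works in either case.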
  • Channel-wise quantization

    Hi, nice idea for quantization. However, the paper (excluding the appendix) does not seem to state that the quantization is channel-wise, while the code shows that it is. Channel-wise quantization is generally expected to outperform layer-wise quantization, so it may be hard to claim that the performance of your method is close to QAT.

    opened by shiyuetianqiang 1
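
The difference between the two granularities can be illustrated with a small pure-Python sketch. It assumes a simple symmetric min-max quantizer, which is chosen for brevity and is not BRECQ's quantizer; all function names are hypothetical. The example uses two output channels with very different weight ranges, the case where a single layer-wide scale hurts the small channel most.

```python
def fake_quantize(values, n_bits, max_abs):
    """Symmetric uniform fake quantization with a scale derived from max_abs."""
    q_max = 2 ** (n_bits - 1) - 1
    scale = max_abs / q_max
    return [max(-q_max, min(q_max, round(v / scale))) * scale for v in values]


def quantize_per_tensor(weights, n_bits):
    """Layer-wise: one scale shared by every output channel."""
    max_abs = max(abs(v) for row in weights for v in row)
    return [fake_quantize(row, n_bits, max_abs) for row in weights]


def quantize_per_channel(weights, n_bits):
    """Channel-wise: each output channel (row) gets its own scale."""
    return [fake_quantize(row, n_bits, max(abs(v) for v in row))
            for row in weights]


def mean_squared_error(a, b):
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


# Two output channels whose weight ranges differ by more than an order of magnitude.
weights = [[0.02, -0.05, 0.04], [1.5, -2.0, 0.7]]
err_tensor = mean_squared_error(weights, quantize_per_tensor(weights, 4))
err_channel = mean_squared_error(weights, quantize_per_channel(weights, 4))
print("per-tensor MSE:", err_tensor, "per-channel MSE:", err_channel)
```

In this toy case the layer-wide scale rounds the whole small channel to zero, while per-channel scales preserve it; how large the gap is on a real network depends on the weight distribution.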
  • Some questions about implementation details

    Hello, thank you for an interesting paper and nice code.

    I have two questions concerning implementation details.

    1. Does the "one-by-one" block reconstruction mentioned in the paper mean that the input to each block comes from the already quantized preceding blocks, i.e. each block may correct quantization errors coming from previous blocks? Or is the input to each block collected from the full-precision model?
    2. Am I correct in understanding that in the block-wise reconstruction objective you use the gradients for each object in the calibration sample independently (i.e. no gradient averaging or something similar, as in Adam, which is mentioned in the paper)? Also, what is happening here in data_utils.py: why do you add 1.0 to the gradients?
    cached_grads = cached_grads.abs() + 1.0
    # scaling to make sure its mean is 1
    # cached_grads = cached_grads * torch.sqrt(cached_grads.numel() / cached_grads.pow(2).sum())
    

    Thank you for your time and consideration!

    opened by AndreevP 0
  • Quantization doesn't work?

    Hi,

    I tried running your code on CIFAR-10 with a pre-trained ResNet50 model; the code is attached below. The float model reaches around 93% accuracy, but after quantization the accuracy is nowhere near that. I get:

    • Accuracy of the network on the 10000 test images: 10.0 % top5: 52.28 %

    Please help me with this. The code is inside the zip file.

    main_cifar.zip

    opened by praneet195 0
  • Using the Fisher-diag Hessian estimation proposed in the paper raises "Trying to backward through the graph a second time"

    Estimating the Hessian with the Fisher-diag approach proposed in the paper requires computing the gradient of each layer's pre-activation. When actually running the code, however, cur_grad = get_grad(cali_data[i * batch_size:(i + 1) * batch_size]) in save_grad_data fails on the second batch with the error "Trying to backward through the graph a second time"; the first batch does not raise an error. Have the authors run into a similar situation?

    opened by ariescts 2
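
The error message in this thread comes from PyTorch's autograd and can be reproduced in isolation. The sketch below is independent of BRECQ's `save_grad_data` and only shows the general mechanism: after a plain `backward()` the saved intermediate buffers are freed, so another backward pass through the same graph raises, unless the earlier pass was called with `retain_graph=True`.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)
loss = (w ** 2).sum()

# First pass keeps the saved buffers so the same graph can be used again.
loss.backward(retain_graph=True)
first = w.grad.clone()

# Second pass succeeds; without retain_graph the buffers are freed afterwards.
w.grad.zero_()
loss.backward()
second = w.grad.clone()

# Third pass goes through a graph whose buffers are already gone.
try:
    loss.backward()
    third_failed = False
except RuntimeError as err:
    third_failed = True
    print("RuntimeError:", str(err).splitlines()[0])

print(first.tolist(), second.tolist(), third_failed)
```

If a second batch fails while the first succeeds, one thing worth checking is whether some tensor computed during the first batch (and therefore attached to its freed graph) is being reused in the second.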
  • Cuda Error when launching example

    /path_to/BRECQ# python main_imagenet.py --data_path /path_to/IMAGENET_2012/ --arch resnet18 --n_bits_w 2 --channel_wise --n_bits_a 4 --act_quant --test_before_calibration
    You are using fake SyncBatchNorm2d who is actually the official BatchNorm2d
    ==> Using Pytorch Dataset
    Downloading: "https://github.com/yhhhli/BRECQ/releases/download/v1.0/resnet18_imagenet.pth.tar" to /root/.cache/torch/hub/checkpoints/resnet18_imagenet.pth.tar
    100%|██████████| 44.6M/44.6M [00:27<00:00, 1.70MB/s]
    Traceback (most recent call last):
      File "main_imagenet.py", line 178, in <module>
        cnn.cuda()
      File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 680, in cuda
        return self._apply(lambda t: t.cuda(device))
      File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 570, in _apply
        module._apply(fn)
      File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 593, in _apply
        param_applied = fn(param)
      File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 680, in <lambda>
        return self._apply(lambda t: t.cuda(device))
    RuntimeError: CUDA error: out of memory
    CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
    For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

    opened by L-ED 1
Owner
Yuhang Li
Research Intern at @SenseTime Group Limited