Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation. In CVPR 2022.

Last update: Dec 28, 2022

Overview

Nonuniform-to-Uniform Quantization

This repository contains the training code for N2UQ, introduced in our CVPR 2022 paper "Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation".

In this study, we propose a quantization method that learns non-uniform input thresholds, retaining the strong representation ability of nonuniform methods, while outputting uniformly quantized levels, so that model inference is as hardware-friendly and efficient as with uniform quantization.

To train the quantized network with learnable input thresholds, we introduce a generalized straight-through estimator (G-STE) to handle the otherwise intractable backward derivative with respect to the threshold parameters.
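The two ideas above (non-uniform thresholds in, uniform levels out, and a surrogate gradient for the backward pass) can be illustrated with a minimal sketch in plain Python. This is not the repository's PyTorch code. The names `start`, `widths`, `quantize`, and `surrogate_slope` are hypothetical, and the choices of placing each threshold at the midpoint of a learnable interval and using a slope of 1/a_i inside interval i are assumptions about how G-STE can be parameterized, not something stated in this README.

```python
from bisect import bisect_right
from itertools import accumulate


def interval_edges(start, widths):
    """Edges [start, start + a1, start + a1 + a2, ...] of the input intervals."""
    return [start] + [start + s for s in accumulate(widths)]


def forward_thresholds(start, widths):
    """ASSUMPTION: each forward threshold is the midpoint of one interval."""
    edges = interval_edges(start, widths)
    return [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]


def quantize(x, start, widths):
    """Forward pass: count the thresholds at or below x.

    The thresholds are non-uniformly spaced, but the result is a uniform
    integer level in 0..len(widths).
    """
    return bisect_right(forward_thresholds(start, widths), x)


def surrogate_slope(x, start, widths):
    """Backward-pass surrogate for d(level)/dx.

    ASSUMPTION: slope 1/a_i inside interval i and 0 outside all intervals,
    so wider intervals pass smaller gradients.
    """
    edges = interval_edges(start, widths)
    i = bisect_right(edges, x) - 1
    if 0 <= i < len(widths):
        return 1.0 / widths[i]
    return 0.0


# 2-bit example: three intervals give three thresholds and four levels.
widths = [0.5, 1.0, 0.5]           # hypothetical learned interval widths
print(forward_thresholds(0.0, widths))                          # [0.25, 1.0, 1.75]
print([quantize(x, 0.0, widths) for x in (0.1, 0.3, 1.2, 3.0)])  # [0, 1, 2, 3]
print(surrogate_slope(1.0, 0.0, widths))                        # 1.0
```

With uniform widths the surrogate slope is the same in every interval, which corresponds to the usual straight-through estimator up to a constant scale.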

The formulas for N2UQ are as follows:

Forward pass:

Backward pass:

Moreover, we propose an L1-norm-based entropy-preserving weight regularization for weight quantization.

Citation

If you find our code useful for your research, please consider citing:

@inproceedings{liu2022nonuniform,
  title={Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation},
  author={Liu, Zechun and Cheng, Kwang-Ting and Huang, Dong and Xing, Eric and Shen, Zhiqiang},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2022}
}

Run

1. Requirements:

python 3.6, pytorch 1.7.1, torchvision 0.8.2
gdown

2. Data:

Download the ImageNet dataset.

3. Pretrained Models:

pip install gdown # gdown will automatically download the models
If gdown doesn't work, you may need to manually download the pretrained models and put them in the corresponding ./models/ folder.

4. Steps to run:

(1) For ResNet architectures:

Change directory to ./resnet/
Run bash run.sh architecture n_bits quantize_downsampling
E.g., bash run.sh resnet18 2 0 quantizes ResNet-18 to 2 bits without quantizing the downsampling layers
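For scripted sweeps over architectures and bit-widths, the command line above can be assembled programmatically. The helper below is a hypothetical sketch, not part of the repository: `build_run_command` is an invented name, and treating `quantize_downsampling` as a 0/1 flag is an assumption based only on the example above, where 0 means the downsampling layers are not quantized.

```python
def build_run_command(architecture, n_bits, quantize_downsampling):
    """Return the argument list for: bash run.sh architecture n_bits quantize_downsampling."""
    # ASSUMPTION: the third argument is a 0/1 flag (the README only shows 0).
    if quantize_downsampling not in (0, 1):
        raise ValueError("quantize_downsampling is assumed to be 0 or 1")
    if int(n_bits) < 1:
        raise ValueError("n_bits must be a positive integer")
    return ["bash", "run.sh", architecture, str(n_bits), str(quantize_downsampling)]


cmd = build_run_command("resnet18", 2, 0)
print(" ".join(cmd))  # bash run.sh resnet18 2 0

# To launch from the ./resnet/ directory, the list could be passed to
# subprocess.run(cmd, cwd="./resnet", check=True).
```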

(2) For MobileNet architectures:

Change directory to ./mobilenetv2/
Run bash run.sh

Models

1. ResNet

Network	Method	W2/A2	W3/A3	W4/A4
ResNet-18	PACT	64.4	68.1	69.2
ResNet-18	DoReFa-Net	64.7	67.5	68.1
ResNet-18	LSQ	67.6	70.2	71.1
ResNet-18	N2UQ	69.4 (Model-Res18-2bit)	71.9 (Model-Res18-3bit)	72.9 (Model-Res18-4bit)
ResNet-18	N2UQ *	69.7 (Model-Res18-2bit)	72.1 (Model-Res18-3bit)	73.1 (Model-Res18-4bit)
ResNet-34	LSQ	71.6	73.4	74.1
ResNet-34	N2UQ	73.3 (Model-Res34-2bit)	75.2 (Model-Res34-3bit)	76.0 (Model-Res34-4bit)
ResNet-34	N2UQ *	73.4 (Model-Res34-2bit)	75.3 (Model-Res34-3bit)	76.1 (Model-Res34-4bit)
ResNet-50	PACT	64.4	68.1	69.2
ResNet-50	LSQ	67.6	70.2	71.1
ResNet-50	N2UQ	75.8 (Model-Res50-2bit)	77.5 (Model-Res50-3bit)	78.0 (Model-Res50-4bit)
ResNet-50	N2UQ *	76.4 (Model-Res50-2bit)	77.6 (Model-Res50-3bit)	78.0 (Model-Res50-4bit)

Note that N2UQ without * denotes quantizing all the convolutional layers except the first input convolutional layer.

N2UQ with * denotes quantizing all the convolutional layers except the first input convolutional layer and three downsampling layers.

W2/A2, W3/A3, W4/A4 denote the cases where the weights and activations are both quantized to 2 bits, 3 bits, and 4 bits, respectively.

2. MobileNet

Network	Methods	W4/A4
MobileNet-V2	N2UQ	72.1 Model-MBV2-4bit

Contact

Zechun Liu, HKUST (zliubq at connect.ust.hk)


