Post-training Quantization for Neural Networks with Provable Guarantees

Last update: Nov 29, 2022

Authors: Jinjie Zhang ([email protected]), Yixuan Zhou ([email protected]) and Rayan Saab ([email protected])

Overview

This directory contains the code needed to run GPFQ, a post-training neural-network quantization method based on a greedy path-following mechanism. It can also be used to reproduce the experimental results in our paper "Post-training Quantization for Neural Networks with Provable Guarantees". In the paper we also prove theoretical guarantees for the proposed method: when quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights, i.e., in the level of over-parametrization.
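
To make the greedy path-following idea concrete, here is a minimal pure-Python sketch for a single neuron. It is an illustration under stated assumptions, not the code in this repo: the function names `nearest_level` and `gpfq_neuron`, the list-of-columns data layout, the handling of all-zero columns, and the ternary alphabet in the usage lines are all hypothetical. The sketch quantizes the weights one at a time, each time choosing the alphabet element that keeps the running error between the analog and the quantized outputs on the sample data as small as possible.

```python
def nearest_level(z, alphabet):
    """Return the element of `alphabet` closest to the scalar z."""
    return min(alphabet, key=lambda a: abs(a - z))


def gpfq_neuron(w, X, alphabet):
    """Greedily quantize the weights of one neuron (illustrative sketch).

    w        : list of N real weights.
    X        : list of N columns; X[t] holds feature t across m samples.
    alphabet : list of allowed quantized values.
    Returns (q, u): the quantized weights and the final error vector,
    where u equals sum_t (w[t] - q[t]) * X[t].
    """
    m = len(X[0])
    u = [0.0] * m  # running error between analog and quantized outputs
    q = []
    for w_t, x_t in zip(w, X):
        norm_sq = sum(v * v for v in x_t)
        if norm_sq == 0.0:
            # Assumption: a zero column carries no information, so round directly.
            q_t = nearest_level(w_t, alphabet)
        else:
            # Project the accumulated error plus the new contribution onto x_t.
            proj = sum(xv * (uv + w_t * xv) for xv, uv in zip(x_t, u)) / norm_sq
            q_t = nearest_level(proj, alphabet)
        # Carry the remaining error forward to the next weight.
        u = [uv + (w_t - q_t) * xv for uv, xv in zip(u, x_t)]
        q.append(q_t)
    return q, u


# One sample, three identical features, ternary alphabet.
q, u = gpfq_neuron([0.3, 0.3, 0.3], [[1.0], [1.0], [1.0]], [-1.0, 0.0, 1.0])
print(q)  # [0.0, 1.0, 0.0]
```

In this toy case, rounding each weight independently would give all zeros and an output error of 0.9, while the greedy procedure compensates for earlier rounding errors and ends with an error of magnitude about 0.1.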

If you make use of this code or our quantization method in your work, please cite the following paper:

 @article{zhang2022posttraining,
     author = {Zhang, Jinjie and Zhou, Yixuan and Saab, Rayan},
     title = {Post-training Quantization for Neural Networks with Provable Guarantees},
     journal = {arXiv preprint arXiv:2201.11113},
     year = {2022}
 }

Note: The code is designed to work primarily with the ImageNet dataset. Because of the size of this dataset, the experiments will likely need more computational resources than a local machine provides. They can be run on a cloud platform such as AWS; we ran our experiments on an m5.8xlarge EC2 instance with 300 GB of disk space.

Installing Dependencies

We assume that a Python version newer than 3.8.0 is installed on the user's machine. The root directory of this repo contains a requirements.txt file listing the Python libraries used by our code.

To install the necessary dependencies, first create and activate a virtual environment:

python3 -m venv .venv
source .venv/bin/activate

The commands above create and activate a new Python virtual environment.
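
As an optional sanity check, the following short sketch reports whether the running interpreter satisfies the version requirement and whether a virtual environment is active. The helper names `python_is_new_enough` and `in_virtualenv` are hypothetical and not part of this repo; the check relies only on the standard `sys` module.

```python
import sys


def python_is_new_enough(version_info=None, minimum=(3, 8, 0)):
    """Return True if the version is strictly greater than `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:3]) > tuple(minimum)


def in_virtualenv():
    """Return True when running inside an environment created by `python3 -m venv`."""
    # Inside a venv, sys.prefix points at the environment and differs from sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Python version OK:", python_is_new_enough())
    print("Virtual environment active:", in_virtualenv())
```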

Then install the packages listed in requirements.txt:

pip3 install -r requirements.txt

This should install all the required dependencies of this project.

Obtaining ImageNet Dataset

In this project we use the ImageNet dataset, specifically the ILSVRC-2012 version.

To obtain the ImageNet dataset, one can submit a request through this link.

Once the dataset is obtained, place the .tar files for both the training set and the validation set under the data/ILSVRC2012 directory of this repo.

Then use the following procedure to extract the ImageNet dataset:

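# Run from the data/ILSVRC2012 directory.
# Extract the training data in its own directory and unpack each per-class archive into a subfolder:
mkdir -p ILSVRC2012_img_train
mv ILSVRC2012_img_train.tar ILSVRC2012_img_train/
cd ILSVRC2012_img_train
tar -xvf ILSVRC2012_img_train.tar && rm -f ILSVRC2012_img_train.tar
find . -name "*.tar" | while read -r NAME ; do mkdir -p "${NAME%.tar}"; tar -xvf "${NAME}" -C "${NAME%.tar}"; rm -f "${NAME}"; done
cd ..
# Extract the validation data:
tar -xvf ILSVRC2012_img_val.tar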
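
After extraction, a quick check of the resulting layout can catch an incomplete unpacking early. The sketch below is one possible way to do that with the standard library; the helper names `count_images_per_class` and `leftover_archives` are hypothetical, and the directory passed in should be wherever the training archive was extracted. The ILSVRC-2012 training set has 1000 classes, so 1000 class subfolders and no remaining .tar files are expected.

```python
from pathlib import Path


def count_images_per_class(train_root):
    """Map each class subfolder name under `train_root` to its number of files."""
    counts = {}
    for class_dir in sorted(p for p in Path(train_root).iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(1 for f in class_dir.iterdir() if f.is_file())
    return counts


def leftover_archives(train_root):
    """Return any .tar files still present, which suggests an unfinished extraction."""
    return sorted(str(p) for p in Path(train_root).rglob("*.tar"))
```

For example, `len(count_images_per_class(train_dir))` should be 1000 and `leftover_archives(train_dir)` should be an empty list once the training data is fully unpacked.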

Running Experiments

The implementation of the modified GPFQ in our paper is contained in quantization_scripts. Additionally, adhoc_quantization_scripts and retraining_scripts provide extra experiments; both are variants of the framework in quantization_scripts. adhoc_quantization_scripts contains heuristic modifications used to further improve the performance of GPFQ, such as bias correction, mixed precision, and leaving the last layer unquantized. retraining_scripts shows a quantization-aware training strategy that retrains the neural network after each layer is quantized.

In this section, we give guidance on running the code in quantization_scripts. The other two directories, adhoc_quantization_scripts and retraining_scripts, are run in a very similar way.

Before getting started, go to the root directory of the repo and run mkdir models to create a directory in which the quantized models will be stored.
The entry point of the project is quantization_scripts/quantize.py. That file contains a section for setting hyperparameters, for example the model_name parameter, the number of bits and the batch size used for quantization, the scalar of alphabets, and the probability for subsampling in CNNs. Note that model_name should match the model that you will quantize. After you have selected a model_name, and assuming you are still in the root directory of this repo, run mkdir models/{model_name}, where {model_name} is the Python string that you provided for the model_name parameter in quantize.py. If the directory already exists, you can skip this step.
Then navigate to the logs directory and run python3 init_logs.py. This prepares a log file that is used to store the results of the experiment.
Finally, open the quantization_scripts directory and run python3 quantize.py to start the experiment.
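
The two directory-creation steps above can also be done from Python, which avoids an error when the directories already exist. This is a small convenience sketch rather than part of the repo; the helper name `prepare_model_dir` and the example model name are hypothetical, and the name passed in should be the same string set for model_name in quantize.py.

```python
from pathlib import Path


def prepare_model_dir(repo_root, model_name):
    """Create <repo_root>/models/<model_name> if needed and return its path."""
    model_dir = Path(repo_root) / "models" / model_name
    # parents=True also creates models/; exist_ok=True makes the call safe to repeat.
    model_dir.mkdir(parents=True, exist_ok=True)
    return model_dir


if __name__ == "__main__":
    # "my_model" is a placeholder; use the model_name set in quantize.py.
    print(prepare_model_dir(".", "my_model"))
```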
