Torch-based tool for quantizing high-dimensional vectors using additive codebooks

Last update: Jan 07, 2023

Overview

Trainable multi-codebook quantization

This repository implements a PyTorch utility, ideally run on GPUs, for training an efficient quantizer based on multiple single-byte codebooks. The prototypical scenario: you have a distribution over vectors in some space, say of dimension 512, perhaps produced by a neural net embedding, and you want to encode each vector into a short sequence of bytes (say, 4 or 8 bytes) from which the vector can be reconstructed with minimal expected loss. The loss is measured as squared distance, i.e. squared l2 loss.
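
As a rough illustration of the idea (not this repository's implementation), the sketch below assumes an additive scheme in which each byte indexes one of 256 entries in its own codebook and the reconstruction is the sum of the selected entries. The codebooks here are random rather than trained, the dimension is shrunk to 8 to keep the example fast, and the names `decode`, `greedy_encode` and `squared_error` are hypothetical helpers, not part of the `quantization` module.

```python
import random

DIM = 8              # toy dimension; the text above uses 512 as an example
NUM_CODEBOOKS = 4    # one byte per codebook, so 4 bytes per vector
CODEBOOK_SIZE = 256  # a single byte can index 256 entries

rng = random.Random(0)
# Untrained, random codebooks: codebooks[k][j] is a list of DIM floats.
codebooks = [[[rng.gauss(0.0, 1.0) for _ in range(DIM)]
              for _ in range(CODEBOOK_SIZE)]
             for _ in range(NUM_CODEBOOKS)]


def decode(codes):
    """Reconstruct a vector as the sum of one entry from each codebook."""
    out = [0.0] * DIM
    for k, j in enumerate(codes):
        for d in range(DIM):
            out[d] += codebooks[k][j][d]
    return out


def squared_error(x, y):
    """Squared l2 distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def greedy_encode(x):
    """Choose one code per codebook, each minimizing the current residual.

    This is a simple greedy heuristic (an assumption for illustration),
    not necessarily the search used by the Quantizer in this repository.
    """
    residual = list(x)
    codes = []
    for k in range(NUM_CODEBOOKS):
        best = min(range(CODEBOOK_SIZE),
                   key=lambda j: squared_error(residual, codebooks[k][j]))
        codes.append(best)
        residual = [r - c for r, c in zip(residual, codebooks[k][best])]
    return bytes(codes)


x = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
codes = greedy_encode(x)   # 4 bytes
x_approx = decode(codes)   # lossy reconstruction of x
print(len(codes), squared_error(x, x_approx))
```

Assuming float32 storage, a 512-dimensional vector occupies 2048 bytes, so a 4-byte code is a 512-fold reduction in size. Training the codebooks on the actual distribution of vectors is what keeps the expected squared error low at that rate.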

This repository provides a Quantizer object that performs this quantization, and an associated QuantizerTrainer object that you use to train the Quantizer. For example, you might invoke the QuantizerTrainer with 20,000 minibatches of vectors.

Usage

Installation

python3 setup.py install

Example

import torch
import quantization

trainer = quantization.QuantizerTrainer(dim=256, bytes_per_frame=4,
                                        device=torch.device('cuda'))
while not trainer.done():
    # let x be some tensor of shape (*, dim) that you will train on
    # (it should be a different minibatch on each step)
    trainer.step(x)
quantizer = trainer.get_quantizer()

# let x be some tensor of shape (*, dim)
encoded = quantizer.encode(x)  # (*, 4), dtype=uint8
x_approx = quantizer.decode(encoded)  # approximate reconstruction of x

To avoid versioning issues, it may be easier to include quantization.py directly in your repository (and add its requirements to your requirements.txt).

Owner

Daniel Povey
