Code for paper "Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and Layer Input Masking"

Overview

model_based_energy_constrained_compression

Code for paper "Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and Layer Input Masking" (https://openreview.net/pdf?id=BylBr3C9K7)

@inproceedings{yang2018energy,
  title={Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and Layer Input Masking},
  author={Yang, Haichuan and Zhu, Yuhao and Liu, Ji},
  booktitle={ICLR},
  year={2019}
}

Prerequisites

Python (3.6)
PyTorch 1.0

To use the ImageNet dataset, download the dataset and move validation images to labeled subfolders (e.g., using https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh)

Training and testing

example

To run the training with energy constraint on AlexNet,

python energy_proj_train.py --net alexnet --dataset imagenet --datadir [imagenet-folder with train and val folders] --batch_size 128 --lr 1e-3 --momentum 0.9 --l2wd 1e-4 --proj_int 10 --logdir ./log/path-of-log --num_workers 8 --exp_bdecay --epochs 30 --distill 0.5 --nodp --budget 0.2

usage

usage: energy_proj_train.py [-h] [--net NET] [--dataset DATASET]
                            [--datadir DATADIR] [--batch_size BATCH_SIZE]
                            [--val_batch_size VAL_BATCH_SIZE]
                            [--num_workers NUM_WORKERS] [--epochs EPOCHS]
                            [--lr LR] [--xlr XLR] [--l2wd L2WD]
                            [--xl2wd XL2WD] [--momentum MOMENTUM]
                            [--lr_decay LR_DECAY] [--lr_decay_e LR_DECAY_E]
                            [--lr_decay_add] [--proj_int PROJ_INT] [--nodp]
                            [--input_mask] [--randinit] [--pretrain PRETRAIN]
                            [--eval] [--seed SEED]
                            [--log_interval LOG_INTERVAL]
                            [--test_interval TEST_INTERVAL]
                            [--save_interval SAVE_INTERVAL] [--logdir LOGDIR]
                            [--distill DISTILL] [--budget BUDGET]
                            [--exp_bdecay] [--mgpu] [--skip1]

Model-Based Energy Constrained Training

optional arguments:
  -h, --help            show this help message and exit
  --net NET             network arch
  --dataset DATASET     dataset used in the experiment
  --datadir DATADIR     dataset dir in this machine
  --batch_size BATCH_SIZE
                        batch size for training
  --val_batch_size VAL_BATCH_SIZE
                        batch size for evaluation
  --num_workers NUM_WORKERS
                        number of workers for training loader
  --epochs EPOCHS       number of epochs to train
  --lr LR               learning rate
  --xlr XLR             learning rate for input mask
  --l2wd L2WD           l2 weight decay
  --xl2wd XL2WD         l2 weight decay (for input mask)
  --momentum MOMENTUM   momentum
  --proj_int PROJ_INT   how many batches for each projection
  --nodp                turn off dropout
  --input_mask          enable input mask
  --randinit            use random init
  --pretrain PRETRAIN   file to load pretrained model
  --eval                evaluate testset in the begining
  --seed SEED           random seed
  --log_interval LOG_INTERVAL
                        how many batches to wait before logging training
                        status
  --test_interval TEST_INTERVAL
                        how many epochs to wait before another test
  --save_interval SAVE_INTERVAL
                        how many epochs to wait before save a model
  --logdir LOGDIR       folder to save to the log
  --distill DISTILL     distill loss weight
  --budget BUDGET       energy budget (relative)
  --exp_bdecay          exponential budget decay
  --mgpu                enable using multiple gpus
  --skip1               skip the first W update
Owner
Haichuan Yang
Haichuan Yang
An optimizer that trains as fast as Adam and as good as SGD.

AdaBound An optimizer that trains as fast as Adam and as good as SGD, for developing state-of-the-art deep learning models on a wide variety of popula

LoLo 2.9k Dec 27, 2022
Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755)

News SRU++, a new SRU variant, is released. [tech report] [blog] The experimental code and SRU++ implementation are available on the dev branch which

ASAPP Research 2.1k Jan 01, 2023
TorchShard is a lightweight engine for slicing a PyTorch tensor into parallel shards

TorchShard is a lightweight engine for slicing a PyTorch tensor into parallel shards. It can reduce GPU memory and scale up the training when the model has massive linear layers (e.g., ViT, BERT and

Kaiyu Yue 275 Nov 22, 2022
A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.

A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.

878 Dec 30, 2022
Differentiable ODE solvers with full GPU support and O(1)-memory backpropagation.

PyTorch Implementation of Differentiable ODE Solvers This library provides ordinary differential equation (ODE) solvers implemented in PyTorch. Backpr

Ricky Chen 4.4k Jan 04, 2023
Pytorch bindings for Fortran

Pytorch bindings for Fortran

Dmitry Alexeev 46 Dec 29, 2022
PyTorch toolkit for biomedical imaging

farabio is a minimal PyTorch toolkit for out-of-the-box deep learning support in biomedical imaging. For further information, see Wikis and Docs.

San Askaruly 47 Dec 28, 2022
PyTorch Implementation of [1611.06440] Pruning Convolutional Neural Networks for Resource Efficient Inference

PyTorch implementation of [1611.06440 Pruning Convolutional Neural Networks for Resource Efficient Inference] This demonstrates pruning a VGG16 based

Jacob Gildenblat 836 Dec 26, 2022
Model summary in PyTorch similar to `model.summary()` in Keras

Keras style model.summary() in PyTorch Keras has a neat API to view the visualization of the model which is very helpful while debugging your network.

Shubham Chandel 3.7k Dec 29, 2022
higher is a pytorch library allowing users to obtain higher order gradients over losses spanning training loops rather than individual training steps.

higher is a library providing support for higher-order optimization, e.g. through unrolled first-order optimization loops, of "meta" aspects of these

Facebook Research 1.5k Jan 03, 2023
Learning Sparse Neural Networks through L0 regularization

Example implementation of the L0 regularization method described at Learning Sparse Neural Networks through L0 regularization, Christos Louizos, Max W

AMLAB 202 Nov 10, 2022
Tez is a super-simple and lightweight Trainer for PyTorch. It also comes with many utils that you can use to tackle over 90% of deep learning projects in PyTorch.

Tez: a simple pytorch trainer NOTE: Currently, we are not accepting any pull requests! All PRs will be closed. If you want a feature or something does

abhishek thakur 1.1k Jan 04, 2023
Code for paper "Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and Layer Input Masking"

model_based_energy_constrained_compression Code for paper "Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and

Haichuan Yang 16 Jun 15, 2022
Over9000 optimizer

Optimizers and tests Every result is avg of 20 runs. Dataset LR Schedule Imagenette size 128, 5 epoch Imagewoof size 128, 5 epoch Adam - baseline OneC

Mikhail Grankin 405 Nov 27, 2022
GPU-accelerated PyTorch implementation of Zero-shot User Intent Detection via Capsule Neural Networks

GPU-accelerated PyTorch implementation of Zero-shot User Intent Detection via Capsule Neural Networks This repository implements a capsule model Inten

Joel Huang 15 Dec 24, 2022
Kaldi-compatible feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd

Kaldi-compatible feature extraction with PyTorch, supporting CUDA, batch processing, chunk processing, and autograd

Fangjun Kuang 119 Jan 03, 2023
On the Variance of the Adaptive Learning Rate and Beyond

RAdam On the Variance of the Adaptive Learning Rate and Beyond We are in an early-release beta. Expect some adventures and rough edges. Table of Conte

Liyuan Liu 2.5k Dec 27, 2022
A simplified framework and utilities for PyTorch

Here is Poutyne. Poutyne is a simplified framework for PyTorch and handles much of the boilerplating code needed to train neural networks. Use Poutyne

GRAAL/GRAIL 534 Dec 17, 2022
Fast Discounted Cumulative Sums in PyTorch

TODO: update this README! Fast Discounted Cumulative Sums in PyTorch This repository implements an efficient parallel algorithm for the computation of

Daniel Povey 7 Feb 17, 2022
High-fidelity performance metrics for generative models in PyTorch

High-fidelity performance metrics for generative models in PyTorch

Vikram Voleti 5 Oct 24, 2021