Implementation of Memory-Efficient Neural Networks with Multi-Level Generation, ICCV 2021

Overview

Memory-Efficient Multi-Level In-Situ Generation (MLG)

By Jiaqi Gu, Hanqing Zhu, Chenghao Feng, Mingjie Liu, Zixuan Jiang, Ray T. Chen and David Z. Pan.

This repo is the official implementation of "Towards Memory-Efficient Neural Networks via Multi-Level in situ Generation".

Introduction

MLG is a general and unified framework to trade expensive memory transactions with ultra-fast on-chip computations, directly translating to performance improvement. MLG explores the intrinsic correlations and bit-level redundancy within DNN kernels and propose a multi-level in situ generation mechanism with mixed-precision bases to achieve on-the-fly recovery of high-resolution parameters with minimum hardware overhead. MLG can boost the memory efficiency by 10-20× with comparable accuracy over four state-of-theart designs, when benchmarked on ResNet-18/DenseNet121/MobileNetV2/V3 with various tasks

flow

We explore intra-kernel and cross-kernel correlation in the accuracy (blue curve) and memory compression ratio (black curve) space with ResNet18/CIFAR-10. Our method generalizes prior DSConv and Blueprint Conv with better efficiency-performance trade-off. teaser

On CIFAR-10/100 and ResNet-18/DenseNet-121, we surpass prior low-rank methods with 10-20x less weight storage cost. exp

Dependencies

  • Python >= 3.6
  • pyutils >= 0.0.1. See pyutils for installation.
  • pytorch-onn >= 0.0.2. See pytorch-onn for installation.
  • Python libraries listed in requirements.txt
  • NVIDIA GPUs and CUDA >= 10.2

Structures

  • core/
    • models/
      • layers/
        • mlg_conv2d and mlg_linear: MLG layer definition
      • resnet.py: MLG-based ResNet definition
      • model_base.py: base model definition with all model utilities
    • builder.py: build training utilities
  • configs: YAML-based config files
  • scripts/: contains experiment scripts
  • train.py: training logic

Usage

  • Pretrain teacher model.
    > python3 train.py configs/cifar10/resnet18/train/pretrain.yml

  • Train MLG-based student model with L2-norm-based projection, knowledge distillation, multi-level orthonormality regularization, (Bi, Bo, qb, qu, qv) = (2, 44, 3, 6, 3).
    > python3 train.py configs/cifar10/resnet18/train/train.yml --teacher.checkpoint=path-to-teacher-ckpt --mlg.projection_alg=train --mlg.kd=1 --mlg.base_in=2 --mlg.base_out=44 --mlg.basis_bit=3 --mlg.coeff_in_bit=6 --mlg.coeff_out_bit=3 --criterion.ortho_weight_loss=0.05

  • Scripts for experiments are in ./scripts. For example, to run teacher model pretraining, you can write proper task setting in SCRIPT=scripts/cifar10/resnet18/pretrain.py and run
    > python3 SCRIPT

  • To train ML-based student model with KD and projection, you can write proper task setting in SCRIPT=scripts/cifar10/resnet18/train.py (need to provide the pretrained teacher checkpoint) and run
    > python3 SCRIPT

Citing Memory-Efficient Multi-Level In-Situ Generation (MLG)

@inproceedings{gu2021MLG,
  title={Towards Memory-Efficient Neural Networks via Multi-Level in situ Generation},
  author={Jiaqi Gu and Hanqing Zhu and Chenghao Feng and Mingjie Liu and Zixuan Jiang and Ray T. Chen and David Z. Pan},
  journal={International Conference on Computer Vision (ICCV)},
  year={2021}
}

Related Papers

  • Jiaqi Gu, Hanqing Zhu, Chenghao Feng, Mingjie Liu, Zixuan Jiang, Ray T. Chen, David Z. Pan, "Towards Memory-Efficient Neural Networks via Multi-Level in situ Generation," ICCV, 2021. [paper | slides]
Owner
Jiaqi Gu
PhD Student at UT Austin
Jiaqi Gu
Disease Informed Neural Networks (DINNs) — neural networks capable of learning how diseases spread, forecasting their progression, and finding their unique parameters (e.g. death rate).

DINN We introduce Disease Informed Neural Networks (DINNs) — neural networks capable of learning how diseases spread, forecasting their progression, a

19 Dec 10, 2022
Source code for paper "Deep Diffusion Models for Robust Channel Estimation", TBA.

diffusion-channels Source code for paper "Deep Diffusion Models for Robust Channel Estimation". Generic flow: Use 'matlab/main.mat' to generate traini

The University of Texas Computational Sensing and Imaging Lab 15 Dec 22, 2022
Reproducing Results from A Hybrid Approach to Targeting Social Assistance

title author date output Reproducing Results from A Hybrid Approach to Targeting Social Assistance Lendie Follett and Heath Henderson 12/28/2021 html_

Lendie Follett 0 Jan 06, 2022
This repository contains demos I made with the Transformers library by HuggingFace.

Transformers-Tutorials Hi there! This repository contains demos I made with the Transformers library by 🤗 HuggingFace. Currently, all of them are imp

3.5k Jan 01, 2023
Implementation of OpenAI paper with Simple Noise Scale on Fastai V2

README Implementation of OpenAI paper "An Empirical Model of Large-Batch Training" for Fastai V2. The code is based on the batch size finder implement

13 Dec 10, 2021
This repository contains the code for our paper VDA (public in EMNLP2021 main conference)

Virtual Data Augmentation: A Robust and General Framework for Fine-tuning Pre-trained Models This repository contains the code for our paper VDA (publ

RUCAIBox 13 Aug 06, 2022
Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis

Readme File for "Using Machine Learning to Test Causal Hypotheses in Conjoint Analysis" by Ham, Imai, and Janson. (2022) All scripts were written and

0 Jan 27, 2022
Robot Hacking Manual (RHM). From robotics to cybersecurity. Papers, notes and writeups from a journey into robot cybersecurity.

RHM: Robot Hacking Manual Download in PDF RHM v0.4 ┃ Read online The Robot Hacking Manual (RHM) is an introductory series about cybersecurity for robo

Víctor Mayoral Vilches 233 Dec 30, 2022
Convex optimization for fun and profit.

CFMM Optimal Routing This repository contains the code needed to generate the figures used in the paper Optimal Routing for Constant Function Market M

Guillermo Angeris 183 Dec 29, 2022
PyTorch implementations of the beta divergence loss.

Beta Divergence Loss - PyTorch Implementation This repository contains code for a PyTorch implementation of the beta divergence loss. Dependencies Thi

Billy Carson 7 Nov 09, 2022
Label-Free Model Evaluation with Semi-Structured Dataset Representations

Label-Free Model Evaluation with Semi-Structured Dataset Representations Prerequisites This code uses the following libraries Python 3.7 NumPy PyTorch

8 Oct 06, 2022
FuseDream: Training-Free Text-to-Image Generationwith Improved CLIP+GAN Space OptimizationFuseDream: Training-Free Text-to-Image Generationwith Improved CLIP+GAN Space Optimization

FuseDream This repo contains code for our paper (paper link): FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimizat

XCL 191 Dec 31, 2022
i-RevNet Pytorch Code

i-RevNet: Deep Invertible Networks Pytorch implementation of i-RevNets. i-RevNets define a family of fully invertible deep networks, built from a succ

Jörn Jacobsen 378 Dec 06, 2022
DeepSpamReview: Detection of Fake Reviews on Online Review Platforms using Deep Learning Architectures. Summer Internship project at CoreView Systems.

Detection of Fake Reviews on Online Review Platforms using Deep Learning Architectures Dataset: https://s3.amazonaws.com/fast-ai-nlp/yelp_review_polar

Ashish Salunkhe 37 Dec 17, 2022
Pytorch Implementation of "Contrastive Representation Learning for Exemplar-Guided Paraphrase Generation"

CRL_EGPG Pytorch Implementation of Contrastive Representation Learning for Exemplar-Guided Paraphrase Generation We use contrastive loss implemented b

YHR 25 Nov 14, 2022
Unofficial implementation of MUSIQ (Multi-Scale Image Quality Transformer)

MUSIQ: Multi-Scale Image Quality Transformer Unofficial pytorch implementation of the paper "MUSIQ: Multi-Scale Image Quality Transformer" (paper link

41 Jan 02, 2023
TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation

TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation Zhaoyun Yin, Pichao Wang, Fan Wang, Xianzhe Xu, Hanling Zhang, Hao Li

DamoCV 25 Dec 16, 2022
Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes, ICCV 2017

AdaptationSeg This is the Python reference implementation of AdaptionSeg proposed in "Curriculum Domain Adaptation for Semantic Segmentation of Urban

Yang Zhang 128 Oct 19, 2022
A collection of SOTA Image Classification Models in PyTorch

A collection of SOTA Image Classification Models in PyTorch

sithu3 85 Dec 30, 2022
Implementation of average- and worst-case robust flatness measures for adversarial training.

Relating Adversarially Robust Generalization to Flat Minima This repository contains code corresponding to the MLSys'21 paper: D. Stutz, M. Hein, B. S

David Stutz 13 Nov 27, 2022