Decorators for maximizing memory utilization with PyTorch & CUDA

Overview

torch-max-mem


This package provides decorators for maximizing memory utilization with PyTorch and CUDA: starting from a maximum value for a tunable parameter (e.g., a batch size), the value is successively halved until no out-of-memory exception occurs.
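
Conceptually, the provided decorators wrap a function taking such a parameter, catch CUDA out-of-memory errors, halve the value, and retry. A minimal, simplified sketch of the idea (hypothetical; the actual implementation additionally handles parameter inspection, caching of working values, and related details):

import functools


def halving_on_oom(parameter_name="batch_size"):
    """Hypothetical sketch: retry with successive halving on out-of-memory errors."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            value = kwargs.pop(parameter_name)
            while value > 0:
                try:
                    return func(*args, **{parameter_name: value}, **kwargs)
                except RuntimeError as error:
                    # only swallow out-of-memory errors; re-raise everything else
                    if "out of memory" not in str(error):
                        raise
                    value //= 2
            raise MemoryError(f"Out of memory even with {parameter_name}=1.")

        return wrapper

    return decorator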

💪 Getting Started

Assume you have a function for batched computation of nearest neighbors using brute-force distance calculation.

import torch

def knn(x, y, batch_size, k: int = 3):
    return torch.cat(
        [
            torch.cdist(x[start : start + batch_size], y).topk(k=k, dim=1, largest=False).indices
            for start in range(0, x.shape[0], batch_size)
        ],
        dim=0,
    )

With torch_max_mem, you can decorate this function so that the batch size is automatically reduced until no out-of-memory error occurs.

import torch
from torch_max_mem import maximize_memory_utilization


@maximize_memory_utilization(parameter_name="batch_size")
def knn(x, y, batch_size, k: int = 3):
    return torch.cat(
        [
            torch.cdist(x[start : start + batch_size], y).topk(k=k, dim=1, largest=False).indices
            for start in range(0, x.shape[0], batch_size)
        ],
        dim=0,
    )

In the code, you can now always pass the largest sensible batch size, e.g.,

x = torch.rand(100, 100, device="cuda")
y = torch.rand(200, 100, device="cuda")
knn(x, y, batch_size=x.shape[0])

🚀 Installation

The most recent release can be installed from PyPI with:

$ pip install torch_max_mem

The most recent code and data can be installed directly from GitHub with:

$ pip install git+https://github.com/mberr/torch-max-mem.git

To install in development mode, use the following:

$ git clone https://github.com/mberr/torch-max-mem.git
$ cd torch-max-mem
$ pip install -e .

👐 Contributing

Contributions, whether filing an issue, making a pull request, or forking, are appreciated. See CONTRIBUTING.md for more information on getting involved.

👋 Attribution

Parts of the logic have been developed with Laurent Vermue for PyKEEN.

⚖️ License

The code in this package is licensed under the MIT License.

🍪 Cookiecutter

This package was created with @audreyfeldroy's cookiecutter package using @cthoyt's cookiecutter-snekpack template.

🛠️ For Developers

See the developer instructions below.

The final section of the README is for developers who want to get involved by making a code contribution.

🥼 Testing

After cloning the repository and installing tox with pip install tox, the unit tests in the tests/ folder can be run reproducibly with:

$ tox

Additionally, these tests are automatically re-run with each commit in a GitHub Action.

📖 Building the Documentation

$ tox -e docs

📦 Making a Release

After installing the package in development mode and installing tox with pip install tox, the commands for making a new release are contained within the finish environment in tox.ini. Run the following from the shell:

$ tox -e finish

This script does the following:

  1. Uses Bump2Version to switch the version number in the setup.cfg and src/torch_max_mem/version.py to not have the -dev suffix
  2. Packages the code in both a tar archive and a wheel
  3. Uploads to PyPI using twine. Be sure to have a .pypirc file configured to avoid the need for manual input at this step (see the example after this list)
  4. Pushes to GitHub. You'll need to make a GitHub release for the commit where the version was bumped.
  5. Bumps the version to the next patch. If you made big changes and want to bump the version by minor, you can use tox -e bumpversion minor afterwards.
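
For step 3, a minimal .pypirc in your home directory using a PyPI API token could look like the following (the token value is a placeholder):

[distutils]
index-servers =
    pypi

[pypi]
username = __token__
password = pypi-<your-api-token>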

Comments
  • Import error

    When trying to run the example from the README, I currently get the following error

    Traceback (most recent call last):
      File ".../torch_max_mem/tmp.py", line 2, in <module>
        from torch_max_mem import maximize_memory_utilization
    ModuleNotFoundError: No module named 'torch_max_mem'
    

    When I check pip list, the package name appears to be the stylized name

    $ pip list | grep max
    torch-max-mem     0.0.1.dev0 .../torch_max_mem/src
    
    opened by mberr 2
  • Add simplified key hasher

    This PR adds a simplification for creating hashers based on the values associated with a subset of keys, without having to define a lambda or named function.
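
    For illustration only (the helper name below is hypothetical, not necessarily the API added by this PR), the idea is roughly:

    # hypothetical illustration of the idea, not necessarily the PR's actual API
    def key_hasher(*keys):
        def hasher(kwargs):
            return hash(tuple(kwargs.get(key) for key in keys))
        return hasher

    # instead of: hasher=lambda kwargs: hash((kwargs.get("m"), kwargs.get("n")))
    hasher = key_hasher("m", "n")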

    opened by mberr 1
  • Code fails for KEYWORD_ONLY params

    The following snippet

    from torch_max_mem import maximize_memory_utilization
    
    
    @maximize_memory_utilization()
    def func(a, *bs, batch_size: int):
        pass
    

    raises an error

    Traceback (most recent call last):
      File ".../tmp.py", line 5, in <module>
        def func(a, *bs, batch_size: int):
      File ".../venv/venv-cpu/lib/python3.8/site-packages/torch_max_mem/api.py", line 274, in __call__
        wrapped = maximize_memory_utilization_decorator(
      File ".../venv/venv-cpu/lib/python3.8/site-packages/torch_max_mem/api.py", line 150, in decorator_maximize_memory_utilization
        raise ValueError(f"{parameter_name} must be a keyword based parameter, but is {_parameter.kind}.")
    ValueError: batch_size must be a keyword based parameter, but is KEYWORD_ONLY.
    

    since _parameter.kind is KEYWORD_ONLY.

    This is overly restrictive: KEYWORD_ONLY parameters can still be passed by keyword, which is all the decorator needs.
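
    For reference, this is how the standard library classifies that parameter; both POSITIONAL_OR_KEYWORD and KEYWORD_ONLY parameters accept keyword arguments:

    import inspect

    def func(a, *bs, batch_size: int):
        pass

    kind = inspect.signature(func).parameters["batch_size"].kind
    print(kind.name)  # KEYWORD_ONLY
    print(kind in {inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY})  # True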

    opened by mberr 0
  • stateful decorator

    Add a decorator which remembers the maximum working parameter value for the next call. Since this is handled internally, we do not need to expose the found parameter value to the outside, leaving the method signature unchanged.
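
    A minimal, self-contained sketch of the idea (hypothetical; not the implemented API): start from the last value that worked, halve on out-of-memory errors, and remember the result for the next call.

    # hypothetical sketch of the idea, not the implemented API
    def stateful_maximize(func, parameter_name="batch_size", initial=1024):
        state = {"value": initial}

        def wrapper(*args, **kwargs):
            value = state["value"]  # start from the last known-good value
            while True:
                try:
                    result = func(*args, **{parameter_name: value}, **kwargs)
                except RuntimeError as error:
                    if "out of memory" not in str(error) or value <= 1:
                        raise
                    value //= 2
                else:
                    state["value"] = value  # remember for the next call
                    return result  # the caller never sees the tuned value

        return wrapper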

    opened by mberr 0
Releases(v0.0.4)
  • v0.0.4(Aug 18, 2022)

    What's Changed

    • Fix ad hoc key hashing by @mberr in https://github.com/mberr/torch-max-mem/pull/7
    • Fix default value handling by @mberr in https://github.com/mberr/torch-max-mem/pull/8

    Full Changelog: https://github.com/mberr/torch-max-mem/compare/v0.0.3...v0.0.4

  • v0.0.3(Aug 18, 2022)

    What's Changed

    • Fix keyword only params by @mberr in https://github.com/mberr/torch-max-mem/pull/6

    Full Changelog: https://github.com/mberr/torch-max-mem/compare/v0.0.2...v0.0.3

  • v0.0.2(May 6, 2022)

    What's Changed

    • Add simplified key hasher by @mberr in https://github.com/mberr/torch-max-mem/pull/3
    • Update README & doc by @mberr in https://github.com/mberr/torch-max-mem/pull/4

    Full Changelog: https://github.com/mberr/torch-max-mem/compare/v0.0.1...v0.0.2

  • v0.0.1(Feb 1, 2022)
