SPARSEDNN

**If you want to use this repo, please send me an email: [email protected], or raise a GitHub issue.**

Fast sparse deep learning on CPUs. This is the kernel library generator described in the paper: https://arxiv.org/abs/2101.07948

Python API: run python fastsparse.py. It has minimal required dependencies and should work anywhere.

C++ API: check out driver_cpu.cpp, or run autotune_cpu_random.sh 128 128 128 0. This requires cnpy to read numpy files, so make sure that you can link to cnpy.

The Python API has significant per-call overhead because it uses ctypes. This is noticeable for smaller matrices but not really noticeable for large ones. All benchmarks in the arXiv paper were run with the C++ API.
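To get a rough feel for the ctypes overhead on your own machine, here is a minimal sketch. It does not use this repo at all: it times the C library's abs() through ctypes against Python's built-in abs(). It assumes a POSIX system where ctypes.CDLL(None) exposes the C library; the helper name per_call_seconds is made up for this example.

```python
import ctypes
import timeit

# Assumption: a POSIX system, where CDLL(None) exposes the C library's symbols.
libc = ctypes.CDLL(None)
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int


def per_call_seconds(fn, calls=100_000):
    """Average wall-clock time of one fn(-5) call (hypothetical helper)."""
    return timeit.timeit(lambda: fn(-5), number=calls) / calls


ctypes_cost = per_call_seconds(libc.abs)  # C abs() reached through ctypes
builtin_cost = per_call_seconds(abs)      # Python's built-in abs()

print(f"ctypes call:  {ctypes_cost * 1e9:.0f} ns")
print(f"builtin call: {builtin_cost * 1e9:.0f} ns")
```

The work done per call here is trivial, so the numbers are almost entirely call overhead. A kernel on a small matrix pays a similar fixed cost per call, while on a large matrix that cost is amortized over much more compute.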

Work that is not yet open sourced: a kernel generator for sparse convolutions (as described in the arXiv paper) using implicit convolution, a lightweight inference engine to get end-to-end results, and sparse int8 kernels. If you are interested in any of this, please email me.

FAQs:

How does this compare to Neural Magic? Last time I checked, the DeepSparse library did not allow you to run kernel-level benchmarks. If you care about end-to-end neural network acceleration, you should definitely go with Neural Magic if they happen to support your model.
Future work? This is not exactly along the lines of my PhD thesis, so I work on this sparingly. If you want to contribute to this repo, you could make a PyTorch or TensorFlow custom op with the Python or C++ API. However, it's unclear how gradients would work, and you will have to compile this op with the fixed sparsity pattern, something that the current PyTorch/TensorFlow frameworks might not support that well.
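To make "compile this op with the fixed sparsity pattern" concrete, here is a toy sketch in pure Python. It is not this repo's generator and does not reflect its API: all function names below are made up for illustration. It only shows the general idea that the nonzero positions and values are baked into the generated code, so a different pattern needs a newly generated kernel.

```python
def generate_spmv_source(matrix, name="spmv_fixed"):
    """Emit Python source for y = matrix @ x with the nonzeros hardcoded."""
    terms_per_row = []
    for row in matrix:
        # Only nonzero entries produce code; zeros vanish from the kernel.
        terms = [f"{value!r} * x[{col}]" for col, value in enumerate(row) if value != 0]
        terms_per_row.append(" + ".join(terms) if terms else "0.0")
    lines = [f"def {name}(x):", "    return [" + ", ".join(terms_per_row) + "]"]
    return "\n".join(lines)


def compile_kernel(matrix, name="spmv_fixed"):
    """Turn the generated source into a callable specialized to this matrix."""
    namespace = {}
    exec(generate_spmv_source(matrix, name), namespace)
    return namespace[name]


def dense_matvec(matrix, x):
    """Reference dense matrix-vector product for checking the kernel."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


weights = [
    [0.0, 2.0, 0.0, 0.0],
    [1.5, 0.0, 0.0, -1.0],
    [0.0, 0.0, 0.0, 0.0],
]
x = [1.0, 2.0, 3.0, 4.0]

kernel = compile_kernel(weights)
print(generate_spmv_source(weights))
print(kernel(x))  # matches dense_matvec(weights, x)
```

Because the pattern lives in the generated source, a framework op built this way would have to be regenerated and recompiled whenever the sparsity pattern changes, which is the difficulty mentioned above.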

Owner

Ziheng Wang
