SpConv: Spatially Sparse Convolution Library

PyPI Install

  • CPU (Linux only): pip install spconv
  • CUDA 10.2: pip install spconv-cu102
  • CUDA 11.1: pip install spconv-cu111
  • CUDA 11.3 (Linux only): pip install spconv-cu113
  • CUDA 11.4: pip install spconv-cu114

spconv is a project that provides a heavily-optimized sparse convolution implementation with Tensor Core support. Check the benchmark to see how fast spconv 2.x runs.

Spconv 1.x code is still available, but we won't provide any support for spconv 1.x since it is deprecated. Use spconv 2.x if possible.

Check the spconv 2.x algorithm introduction to understand the sparse convolution algorithm used in spconv 2.x.

WARNING: users of spconv < 2.1.4 need to upgrade to 2.1.4; it fixes a serious bug in SparseInverseConvXd.

Breaking changes in Spconv 2.x

Spconv 1.x users NEED TO READ THIS before using spconv 2.x.

Spconv 2.1 vs Spconv 1.x

  • spconv can now be installed by pip; see the install section in this readme for more details. Users no longer need to build manually!
  • Microsoft Windows support (only Windows 10 has been tested).
  • fp32 (not tf32) training/inference speed is increased (+50~80%).
  • fp16 training/inference speed is greatly increased when your layers support Tensor Cores (channel size must be a multiple of 8).
  • int8 ops are ready, but we still need some time to figure out how to run int8 in PyTorch.
  • spconv 2.x doesn't depend on the PyTorch binary, but you need at least PyTorch >= 1.6.0 to run it.
  • Since spconv 2.x doesn't depend on the PyTorch binary (and never will), it's impossible to support torch.jit/libtorch inference.

Spconv 2.x Development and Roadmap

Spconv 2.2 development has started. See this issue for more details.

See the dev plan. A complete guide to spconv development will be released soon.

Usage

First, use import spconv.pytorch as spconv in spconv 2.x.

Then see the usage docs; a minimal example is sketched below.

Don't forget to check the performance guide.
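
The following is a minimal sketch of how a small sparse CNN is typically built with spconv 2.x; the ExampleNet class, channel sizes, and spatial shape are illustrative only, not taken from the official docs.

    import torch
    import spconv.pytorch as spconv

    class ExampleNet(torch.nn.Module):
        def __init__(self, spatial_shape):
            super().__init__()
            self.spatial_shape = spatial_shape  # e.g. [41, 1600, 1408] for a voxel grid
            self.net = spconv.SparseSequential(
                # submanifold conv: keeps the sparsity pattern unchanged
                spconv.SubMConv3d(32, 64, 3, indice_key="subm0"),
                torch.nn.ReLU(),
                # regular sparse conv: downsamples and grows the active set
                spconv.SparseConv3d(64, 64, 3, stride=2),
                torch.nn.ReLU(),
            )

        def forward(self, features, indices, batch_size):
            # features: [N, 32] float tensor of per-voxel features
            # indices:  [N, 4] int32 tensor of (batch_idx, z, y, x) coordinates
            x = spconv.SparseConvTensor(features, indices, self.spatial_shape, batch_size)
            out = self.net(x)
            return out.dense()  # convert back to a dense NCDHW tensor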

Install

You need Python >= 3.6 (>= 3.7 for Windows) to use spconv 2.x.

You need to install the CUDA toolkit before using prebuilt binaries or building from source.

You need at least CUDA 10.2 to build and run spconv 2.x. We won't offer any support for CUDA < 10.2.

Prebuilt

We offer Python 3.6-3.10 and CUDA 10.2/11.1/11.3/11.4 prebuilt binaries for Linux (manylinux).

We offer Python 3.7-3.10 and CUDA 10.2/11.1/11.4 prebuilt binaries for Windows 10/11.

We provide prebuilts for the CUDA versions supported by the latest PyTorch release. For example, PyTorch 1.10 provides CUDA 10.2 and 11.3 prebuilts, so we provide them too.

Linux users need pip >= 20.3 to install the prebuilt wheels.

CUDA 11.1 will be removed in spconv 2.2 because PyTorch 1.10 doesn't provide prebuilts for it.

pip install spconv for CPU only (Linux only). Use this only for debugging; performance isn't optimized due to manylinux limits (no OpenMP support).

pip install spconv-cu102 for CUDA 10.2

pip install spconv-cu111 for CUDA 11.1

pip install spconv-cu113 for CUDA 11.3 (Linux Only)

pip install spconv-cu114 for CUDA 11.4

NOTE: On Linux, it's safe for the system CUDA version and the conda (PyTorch) CUDA version to differ in their minor version. For example, you can use spconv-cu114 with the Anaconda build of PyTorch for CUDA 11.1 on an OS with CUDA 11.2 installed.

NOTE: On Linux, you can install spconv-cuxxx without installing CUDA system-wide; only a suitable NVIDIA driver is required. For CUDA 11, the driver must be >= 450.82.
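
If you're unsure which prebuilt wheel matches your environment, a quick check like the following (assuming PyTorch is already installed) shows the CUDA version your PyTorch build targets and whether the driver is usable:

    import torch

    print("PyTorch CUDA version:", torch.version.cuda)   # e.g. "11.4" -> pip install spconv-cu114
    print("CUDA available:", torch.cuda.is_available())  # False usually points to a driver problem
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))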

Build from source for development (JIT, recommended)

The C++ code is rebuilt automatically whenever you change C++ code in the project.

For NVIDIA embedded platforms, you need to specify the CUDA arch before building, e.g. export CUMM_CUDA_ARCH_LIST="7.2" for Xavier.

You need to remove cumm from the requires section of pyproject.toml after installing cumm in editable mode and before installing spconv, due to a pyproject limitation (an editable-installed cumm can't be found).

Linux

  1. Uninstall any spconv and cumm installed by pip.
  2. Install build-essential and CUDA.
  3. git clone https://github.com/FindDefinition/cumm, cd ./cumm, pip install -e .
  4. git clone https://github.com/traveller59/spconv, cd ./spconv, pip install -e .
  5. In Python, import spconv and wait for the build to finish.

Windows

  1. Uninstall any spconv and cumm installed by pip.
  2. Install Visual Studio 2019 or newer (make sure the C++ development component is installed), then install CUDA.
  3. Set the PowerShell script execution policy.
  4. Start a new PowerShell and run tools/msvc_setup.ps1.
  5. git clone https://github.com/FindDefinition/cumm, cd ./cumm, pip install -e .
  6. git clone https://github.com/traveller59/spconv, cd ./spconv, pip install -e .
  7. In Python, import spconv and wait for the build to finish (a quick sanity check is sketched below).
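
After an editable install on either platform, an optional sanity check is to import the package and construct a layer; the first import triggers the JIT build of the C++/CUDA extension, so it can take a while. The layer below is illustrative only.

    import spconv.pytorch as spconv

    # The import above triggers the JIT build on first use; constructing a layer
    # confirms that the Python API and the compiled extension load correctly.
    layer = spconv.SubMConv3d(16, 32, 3)
    print(layer)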

Build wheel from source (not recommended; this is done in CI)

You need to rebuild cumm first if you are building against a CUDA version that isn't provided in the prebuilts.

Linux

  1. Install build-essential and CUDA.
  2. Run export SPCONV_DISABLE_JIT="1".
  3. Run pip install pccm cumm wheel.
  4. Run python setup.py bdist_wheel, then pip install dists/xxx.whl.

Windows

  1. Install Visual Studio 2019 or newer (make sure the C++ development component is installed), then install CUDA.
  2. Set the PowerShell script execution policy.
  3. Start a new PowerShell and run tools/msvc_setup.ps1.
  4. Run $Env:SPCONV_DISABLE_JIT = "1".
  5. Run pip install pccm cumm wheel.
  6. Run python setup.py bdist_wheel, then pip install dists/xxx.whl.

Note

This work was done while the author was an employee at TuSimple.

LICENSE

Apache 2.0
