This package simplifies exporting PyTorch models to ONNX and TensorRT and provides a basic interface for model inference.

Overview

PyTorch Infer Utils

To install

git clone https://github.com/gorodnitskiy/pytorch_infer_utils.git
pip install /path/to/pytorch_infer_utils/

Export PyTorch model to ONNX

  • Check the model for denormal weights to achieve better performance. Use the load_weights_rounded_model function to load the model with rounded weights:
    from pytorch_infer_utils import load_weights_rounded_model
    
    model = ModelClass()
    load_weights_rounded_model(
        model,
        "/path/to/model_state_dict",
        map_location=map_location
    )
    
  • Use the ONNXExporter.torch2onnx method to export a PyTorch model to ONNX:
    import torch
    from pytorch_infer_utils import ONNXExporter
    
    model = ModelClass()
    model.load_state_dict(
        torch.load("/path/to/model_state_dict", map_location=map_location)
    )
    model.eval()
    
    exporter = ONNXExporter()
    input_shapes = [-1, 3, 224, 224]  # -1 marks a dynamic dimension
    exporter.torch2onnx(model, "/path/to/model.onnx", input_shapes)
    
  • Use the ONNXExporter.optimize_onnx method to optimize the ONNX model via onnxoptimizer:
    from pytorch_infer_utils import ONNXExporter
    
    exporter = ONNXExporter()
    exporter.optimize_onnx("/path/to/model.onnx", "/path/to/optimized_model.onnx")
    
  • Use the ONNXExporter.optimize_onnx_sim method to optimize the ONNX model via onnx-simplifier. Take care that onnx-simplifier does not drop the dynamic shapes:
    from pytorch_infer_utils import ONNXExporter
    
    exporter = ONNXExporter()
    exporter.optimize_onnx_sim("/path/to/model.onnx", "/path/to/optimized_model.onnx")
    
  • A method that combines the methods above is also available, ONNXExporter.torch2optimized_onnx:
    import torch
    from pytorch_infer_utils import ONNXExporter
    
    model = ModelClass()
    model.load_state_dict(
        torch.load("/path/to/model_state_dict", map_location=map_location)
    )
    model.eval()
    
    exporter = ONNXExporter()
    input_shapes = [-1, 3, -1, -1]  # -1 marks a dynamic dimension
    exporter.torch2optimized_onnx(model, "/path/to/model.onnx", input_shapes)
    
  • Other parameters that can be passed at class initialization:
    • default_shapes: shapes to use for dimensions that are dynamic, default = [1, 3, 224, 224]
    • onnx_export_params:
      • export_params: store the trained parameter weights inside the model file, default = True
      • do_constant_folding: whether to execute constant folding for optimization, default = True
      • input_names: the model's input names, default = ["input"]
      • output_names: the model's output names, default = ["output"]
      • opset_version: the ONNX opset version to export the model to, default = 11
    • onnx_optimize_params:
      • fixed_point: use fixed point, default = False
      • passes: optimization passes, default = [ "eliminate_deadend", "eliminate_duplicate_initializer", "eliminate_identity", "eliminate_if_with_const_cond", "eliminate_nop_cast", "eliminate_nop_dropout", "eliminate_nop_flatten", "eliminate_nop_monotone_argmax", "eliminate_nop_pad", "eliminate_nop_transpose", "eliminate_unused_initializer", "extract_constant_to_initializer", "fuse_add_bias_into_conv", "fuse_bn_into_conv", "fuse_consecutive_concats", "fuse_consecutive_log_softmax", "fuse_consecutive_reduce_unsqueeze", "fuse_consecutive_squeezes", "fuse_consecutive_transposes", "fuse_matmul_add_bias_into_gemm", "fuse_pad_into_conv", "fuse_transpose_into_gemm", "lift_lexical_references", "nop" ]
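
The interplay between -1 entries in input_shapes and default_shapes can be sketched as follows. This is a minimal illustration in plain Python, not the package's implementation: the helper name resolve_dynamic_shapes is hypothetical, and it assumes that every -1 is simply replaced by the value at the same position in default_shapes.

```python
from typing import List, Sequence, Tuple


def resolve_dynamic_shapes(
    input_shapes: Sequence[int],
    default_shapes: Sequence[int] = (1, 3, 224, 224),
) -> Tuple[List[int], List[int]]:
    """Return (concrete_shape, dynamic_axes) for a shape with -1 placeholders.

    Hypothetical helper: it only illustrates the idea described above.
    """
    if len(input_shapes) != len(default_shapes):
        raise ValueError("input_shapes and default_shapes must have the same rank")

    concrete_shape: List[int] = []
    dynamic_axes: List[int] = []
    for axis, (dim, default_dim) in enumerate(zip(input_shapes, default_shapes)):
        if dim == -1:
            # Dynamic dimension: use the default size and remember the axis.
            concrete_shape.append(default_dim)
            dynamic_axes.append(axis)
        else:
            concrete_shape.append(dim)
    return concrete_shape, dynamic_axes


concrete, dynamic = resolve_dynamic_shapes([-1, 3, -1, -1])
print(concrete)  # [1, 3, 224, 224]
print(dynamic)   # [0, 2, 3]
```

In an export pipeline of this kind, the concrete shape would typically be used to build a dummy input for tracing, and the list of dynamic axes would be handed to the dynamic_axes argument of torch.onnx.export.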

Export ONNX to TensorRT

  • Check that TensorRT works via the check_tensorrt_health function.
  • Use the TRTEngineBuilder.build_engine method to build a TensorRT engine from an ONNX model:
    from pytorch_infer_utils import TRTEngineBuilder
    
    exporter = TRTEngineBuilder()
    # get the engine object without saving it
    engine = exporter.build_engine("/path/to/model.onnx")
    # or save engine to /path/to/model.trt
    exporter.build_engine("/path/to/model.onnx", engine_path="/path/to/model.trt")
    
  • fp16_mode is available:
    from pytorch_infer_utils import TRTEngineBuilder
    
    exporter = TRTEngineBuilder()
    engine = exporter.build_engine("/path/to/model.onnx", fp16_mode=True)
    
  • int8_mode is available. It requires calibration_set, a set of images as List[Any]; load_image_func, a function that reads and preprocesses an image correctly; and max_image_shape, the maximum image size as [C, H, W], which is used to allocate the right amount of memory:
    from pytorch_infer_utils import TRTEngineBuilder
    
    exporter = TRTEngineBuilder()
    engine = exporter.build_engine(
        "/path/to/model.onnx",
        int8_mode=True,
        calibration_set=calibration_set,
        max_image_shape=max_image_shape,
        load_image_func=load_image_func,
    )
    
  • Additional parameters for the builder config (builder.create_builder_config) can be passed via kwargs.
  • Other parameters that can be passed at class initialization:
    • opt_shape_dict: optimal shapes, default = {'input': [[1, 3, 224, 224], [1, 3, 224, 224], [1, 3, 224, 224]]}
    • max_workspace_size: max workspace size, default = [1, 30]
    • stream_batch_size: batch size for network forward passes during int8 calibration, default = 100
    • cache_file: int8_mode cache filename, default = "model.trt.int8calibration"
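
A quick sanity check for opt_shape_dict can catch mistakes before an engine build starts. The sketch below rests on an assumption that the README does not state: that the three shapes per input are ordered as [min, opt, max], as in a TensorRT optimization profile. The helper validate_opt_shape_dict is hypothetical and not part of the package.

```python
from typing import Dict, List

ShapeTriple = List[List[int]]


def validate_opt_shape_dict(opt_shape_dict: Dict[str, ShapeTriple]) -> None:
    """Raise ValueError unless every input has consistent [min, opt, max] shapes.

    Assumption: the three shapes are ordered as min, opt, max.
    """
    for name, shapes in opt_shape_dict.items():
        if len(shapes) != 3:
            raise ValueError(f"{name}: expected 3 shapes, got {len(shapes)}")
        min_shape, opt_shape, max_shape = shapes
        if not (len(min_shape) == len(opt_shape) == len(max_shape)):
            raise ValueError(f"{name}: all three shapes must have the same rank")
        for axis, (lo, opt, hi) in enumerate(zip(min_shape, opt_shape, max_shape)):
            # Each dimension has to satisfy min <= opt <= max.
            if not lo <= opt <= hi:
                raise ValueError(
                    f"{name}: axis {axis} violates min <= opt <= max "
                    f"({lo}, {opt}, {hi})"
                )


# The default from the list above: a fully static 1x3x224x224 input.
validate_opt_shape_dict(
    {"input": [[1, 3, 224, 224], [1, 3, 224, 224], [1, 3, 224, 224]]}
)

# A dynamic batch dimension between 1 and 8, with 4 as the optimum.
validate_opt_shape_dict(
    {"input": [[1, 3, 224, 224], [4, 3, 224, 224], [8, 3, 224, 224]]}
)
```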

Inference via onnxruntime on CPU and onnx_tensorrt on GPU

  • The __init__ of the base class ONNXWrapper has the following structure:
    def __init__(
        self,
        onnx_path: str,
        gpu_device_id: Optional[int] = None,
        intra_op_num_threads: Optional[int] = 0,
        inter_op_num_threads: Optional[int] = 0,
    ) -> None:
        """
        :param onnx_path: onnx-file path, required
        :param gpu_device_id: gpu device id to use; if None, onnxruntime
            is used on CPU, default = None
        :param intra_op_num_threads: ort_session_options.intra_op_num_threads,
            set to 0 to let onnxruntime choose by itself, default = 0
        :param inter_op_num_threads: ort_session_options.inter_op_num_threads,
            set to 0 to let onnxruntime choose by itself, default = 0
        :type onnx_path: str
        :type gpu_device_id: int
        :type intra_op_num_threads: int
        :type inter_op_num_threads: int
        """
        if gpu_device_id is None:
            import onnxruntime
    
            self.is_using_tensorrt = False
            ort_session_options = onnxruntime.SessionOptions()
            ort_session_options.intra_op_num_threads = intra_op_num_threads
            ort_session_options.inter_op_num_threads = inter_op_num_threads
            self.ort_session = onnxruntime.InferenceSession(
                onnx_path, ort_session_options
            )
    
        else:
            import onnx
            import onnx_tensorrt.backend as backend
    
            self.is_using_tensorrt = True
            model_proto = onnx.load(onnx_path)
            for gr_input in model_proto.graph.input:
                gr_input.type.tensor_type.shape.dim[0].dim_value = 1
    
            self.engine = backend.prepare(
                model_proto, device=f"CUDA:{gpu_device_id}"
            )
    
  • The ONNXWrapper.run method is expected to follow this structure:
    img = self._process_img_(img)
    if self.is_using_tensorrt:
        preds = self.engine.run(img)
    else:
        ort_inputs = {self.ort_session.get_inputs()[0].name: img}
        preds = self.ort_session.run(None, ort_inputs)
    
    preds = self._process_preds_(preds)
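
The run structure above leaves _process_img_ and _process_preds_ to the subclass. The sketch below shows that pattern with a stand-in base class and a fake backend, so that it runs without onnxruntime or TensorRT. SketchWrapper, ArgmaxClassifier, and fake_backend are hypothetical names used only for illustration; they are not the package's classes.

```python
from typing import Any, Callable, List


class SketchWrapper:
    """Stand-in that mirrors the run structure above; not the package's ONNXWrapper."""

    def __init__(self, backend: Callable[[Any], Any]) -> None:
        # `backend` takes the place of the onnxruntime / onnx_tensorrt session.
        self.backend = backend

    def _process_img_(self, img: Any) -> Any:
        raise NotImplementedError

    def _process_preds_(self, preds: Any) -> Any:
        raise NotImplementedError

    def run(self, img: Any) -> Any:
        img = self._process_img_(img)
        preds = self.backend(img)
        return self._process_preds_(preds)


class ArgmaxClassifier(SketchWrapper):
    def _process_img_(self, img: List[float]) -> List[float]:
        # Scale pixel values from 0..255 to 0..1.
        return [pixel / 255.0 for pixel in img]

    def _process_preds_(self, preds: List[float]) -> int:
        # Return the index of the highest score.
        return max(range(len(preds)), key=preds.__getitem__)


def fake_backend(img: List[float]) -> List[float]:
    # Pretend model: treats the preprocessed values as class scores.
    return list(img)


classifier = ArgmaxClassifier(fake_backend)
print(classifier.run([10, 200, 30]))  # 1
```

With the real base class, only the two hook methods would need to be written; the choice between onnxruntime and onnx_tensorrt stays inside __init__ and run.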
    

Inference via onnxruntime on CPU and TensorRT on GPU

  • The __init__ of the base class TRTWrapper has the following structure:
    def __init__(
        self,
        onnx_path: Optional[str] = None,
        trt_path: Optional[str] = None,
        gpu_device_id: Optional[int] = None,
        intra_op_num_threads: Optional[int] = 0,
        inter_op_num_threads: Optional[int] = 0,
        fp16_mode: bool = False,
    ) -> None:
        """
        :param onnx_path: onnx-file path, default = None
        :param trt_path: TensorRT engine file path, default = None
        :param gpu_device_id: gpu device id to use; if None, onnxruntime
            is used on CPU, default = None
        :param intra_op_num_threads: ort_session_options.intra_op_num_threads,
            set to 0 to let onnxruntime choose by itself, default = 0
        :param inter_op_num_threads: ort_session_options.inter_op_num_threads,
            set to 0 to let onnxruntime choose by itself, default = 0
        :param fp16_mode: use fp16_mode if the class is initialized on GPU
            with onnx_path only, default = False
        :type onnx_path: str
        :type trt_path: str
        :type gpu_device_id: int
        :type intra_op_num_threads: int
        :type inter_op_num_threads: int
        :type fp16_mode: bool
        """
        if gpu_device_id is None:
            import onnxruntime
    
            self.is_using_tensorrt = False
            ort_session_options = onnxruntime.SessionOptions()
            ort_session_options.intra_op_num_threads = intra_op_num_threads
            ort_session_options.inter_op_num_threads = inter_op_num_threads
            self.ort_session = onnxruntime.InferenceSession(
                onnx_path, ort_session_options
            )
    
        else:
            self.is_using_tensorrt = True
            if trt_path is None:
                builder = TRTEngineBuilder()
                trt_path = builder.build_engine(onnx_path, fp16_mode=fp16_mode)
    
            self.trt_session = TRTRunWrapper(trt_path)
    
  • The TRTWrapper.run method is expected to follow this structure:
    img = self._process_img_(img)
    if self.is_using_tensorrt:
        preds = self.trt_session.run(img)
    else:
        ort_inputs = {self.ort_session.get_inputs()[0].name: img}
        preds = self.ort_session.run(None, ort_inputs)
    
    preds = self._process_preds_(preds)
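
The branching in the __init__ above can be summarized as a small decision function. This is only a sketch of the logic as read from the code shown here: select_backend is a hypothetical helper, and the argument checks that raise ValueError are additions for illustration that the original __init__ does not perform.

```python
from typing import Optional


def select_backend(
    onnx_path: Optional[str] = None,
    trt_path: Optional[str] = None,
    gpu_device_id: Optional[int] = None,
) -> str:
    """Describe which inference path the arguments lead to."""
    if gpu_device_id is None:
        # No GPU requested: onnxruntime needs the ONNX file.
        if onnx_path is None:
            raise ValueError("onnx_path is required for onnxruntime inference")
        return "onnxruntime"
    if trt_path is not None:
        # A prebuilt engine is given, so it is loaded directly.
        return "tensorrt: load existing engine"
    if onnx_path is None:
        raise ValueError("either trt_path or onnx_path is required on GPU")
    # Only the ONNX file is given: an engine has to be built first.
    return "tensorrt: build engine from ONNX"


print(select_backend(onnx_path="/path/to/model.onnx"))
# onnxruntime
print(select_backend(trt_path="/path/to/model.trt", gpu_device_id=0))
# tensorrt: load existing engine
print(select_backend(onnx_path="/path/to/model.onnx", gpu_device_id=0))
# tensorrt: build engine from ONNX
```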
    

Environment

TensorRT

  • Follow the official TensorRT installation guide.
  • Requires the CUDA runtime and the CUDA Toolkit.
  • The following additional Python packages are also required. They are not listed in setup.cfg because the right versions depend on the CUDA environment:
    • pycuda
    • nvidia-tensorrt
    • nvidia-pyindex

onnx_tensorrt

  • onnx_tensorrt requires cuda-runtime and tensorrt.
  • To install:
    git clone --depth 1 --branch 21.02 https://github.com/onnx/onnx-tensorrt.git
    cd onnx-tensorrt
    cp -r onnx_tensorrt /usr/local/lib/python3.8/dist-packages
    cd ..
    rm -rf onnx-tensorrt
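
Since these dependencies are installed by hand, it can help to check which of them Python can actually find. The sketch below uses only the standard library. The helper find_modules is hypothetical and is not the package's check_tensorrt_health function, and the import names in the example are assumed to be the usual ones for these packages.

```python
import importlib.util
from typing import Dict, Iterable


def find_modules(module_names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether it can be located."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


# Assumed import names of the packages mentioned in this section.
report = find_modules(["onnxruntime", "tensorrt", "pycuda", "onnx_tensorrt"])
for name, found in report.items():
    print(f"{name}: {'found' if found else 'missing'}")
```

The check only locates the modules and does not import them, so it says nothing about whether the CUDA runtime itself is usable.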
    
Owner
Alex Gorodnitskiy
Computer Vision Engineer 🤖