PyTorch Performance Tuning, WandB, AMP, Multi-GPU, TensorRT, Triton

Last update: Feb 25, 2022

Plant Pathology 2020 FGVC7

Introduction

A deep learning pipeline for training, experimentation, and deployment, built for the Kaggle competition Plant Pathology 2020 and utilising:

PyTorch: A Deep Learning Framework for high-performance AI research
Weights and Biases: tool for experiment tracking, dataset versioning, and model management
Apex: A Library to Accelerate Deep Learning Training using AMP, Fused Optimizer, and Multi-GPU
TensorRT: high-performance neural network inference optimizer and runtime engine for production deployment
Triton Inference Server: inference serving software that simplifies the deployment of AI models at scale
Streamlit: framework to quickly build highly interactive web applications for machine learning models

For a quick tutorial on each of these modules, check out the tutorials folder. Exploratory data analysis for the dataset can be found in the notebooks folder.

Structure

├── app                 # Interactive Streamlit app scripts
├── data                # Datasets
├── examples            # Assignment on PyTorch AMP and DDP
├── model               # Directory to save models for Triton
├── notebooks           # EDA, Training, Model conversion, Inferencing and other utility notebooks
├── tutorials           # Tutorials on the modules used
└── requirements.txt    # Basic requirements

Usage

EDA: Data Evaluation

The data can be explored with the visualization techniques provided in the eda.ipynb notebook in the notebooks folder.

Training the model

To train the PyTorch ResNet-50 model, use pytorch_train.ipynb.

The code is inspired by the PyTorch Performance Tuning Guide.
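As a rough illustration of the mixed-precision technique mentioned above, the sketch below runs one training step with PyTorch's native AMP (`torch.autocast` plus a gradient scaler). It is not taken from pytorch_train.ipynb, and it uses native AMP rather than Apex. The tiny linear model, the random batch, and the `train_step` helper are hypothetical stand-ins for the ResNet-50 and the real data loader; the 4 output classes are an assumption about the dataset. On a machine without a GPU, autocast and the scaler are disabled and the step runs in plain FP32.

```python
import torch
from torch import nn


def train_step(model, batch, targets, optimizer, scaler, device_type):
    """Run one training step (mixed precision when the scaler is enabled)."""
    optimizer.zero_grad(set_to_none=True)
    # autocast runs eligible ops in reduced precision; it does nothing when disabled.
    with torch.autocast(device_type=device_type, enabled=scaler.is_enabled()):
        loss = nn.functional.cross_entropy(model(batch), targets)
    # The scaler multiplies the loss to avoid FP16 gradient underflow,
    # then unscales the gradients before the optimizer step.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()


torch.manual_seed(0)
device_type = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical stand-in model: 4 output classes is an assumption.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 4)).to(device_type)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler(enabled=(device_type == "cuda"))

# Random tensors in place of a real image batch.
batch = torch.randn(8, 3, 32, 32, device=device_type)
targets = torch.randint(0, 4, (8,), device=device_type)

loss_value = train_step(model, batch, targets, optimizer, scaler, device_type)
print(f"loss after one step: {loss_value:.4f}")
```

Keeping the scaler and autocast behind a single `enabled` switch makes it easy to compare FP32 and mixed-precision runs with otherwise identical code.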

Once the model is trained, you can also run model explainability analysis using the shap library. The tutorial notebook for this can be found in the notebooks folder.

Model Conversion and Inferencing

Once you've trained the model, convert it to other formats to reduce inference time and simplify deployment. You can convert the model to ONNX, TensorRT FP32, and TensorRT FP16, all of which are optimised for faster inference. You will also need to convert the PyTorch model to TorchScript. The procedure for converting and benchmarking each format can be found in the notebooks folder.
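The TorchScript conversion step can be sketched as follows. This is a minimal example under stated assumptions, not the code from the conversion notebooks: the small convolutional network is a hypothetical stand-in for the trained ResNet-50, the 224x224 input size and 4 output classes are assumptions, and the model is saved to an in-memory buffer so the snippet leaves no files behind. In practice you would pass a file path to `torch.jit.save`.

```python
import io

import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for the trained model; eval() freezes dropout/batch-norm behaviour.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 4),
).eval()

# Tracing records the operations executed for this example input.
example = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    traced = torch.jit.trace(model, example)

    # Serialise and reload the traced module (a file path works here too).
    buffer = io.BytesIO()
    torch.jit.save(traced, buffer)
    buffer.seek(0)
    reloaded = torch.jit.load(buffer)

    # The reloaded module should reproduce the original outputs.
    max_diff = (model(example) - reloaded(example)).abs().max().item()

print(f"max output difference after round trip: {max_diff:.2e}")
```

Comparing outputs before and after conversion is a cheap sanity check worth repeating for every exported format. Note that tracing only captures the code path taken by the example input, so models with data-dependent control flow may need `torch.jit.script` instead.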

Model Deployment and Benchmarking

Now your models are ready to be deployed. For deployment, we utilise the Triton Inference Server, an inference serving solution that makes deep learning models easy to deploy and integrate. It supports the HTTP and gRPC protocols, which allow clients to request inference from any model managed by the server. The deployment process is described in Triton Inference Server.md.
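To make the HTTP side concrete, here is a minimal sketch of how a client could build an inference request in the JSON format Triton accepts on its `/v2/models/<model>/infer` endpoint. It uses only the Python standard library and does not send anything. The server URL, the model name `resnet50`, the input name `INPUT__0`, and the helper `build_infer_request` are all assumptions for illustration; the real names depend on your model configuration.

```python
import json
import urllib.request


def build_infer_request(server_url, model_name, pixels, shape, input_name="INPUT__0"):
    """Build (but do not send) a JSON inference request for a Triton model."""
    body = {
        "inputs": [
            {
                "name": input_name,        # must match the model's configured input name
                "shape": list(shape),      # tensor shape, including the batch dimension
                "datatype": "FP32",
                "data": [float(v) for v in pixels],  # flattened tensor values
            }
        ]
    }
    return urllib.request.Request(
        url=f"{server_url}/v2/models/{model_name}/infer",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical 1x3x2x2 input; a real image tensor would be far larger.
request = build_infer_request(
    "http://localhost:8000", "resnet50", [0.0] * (3 * 2 * 2), (1, 3, 2, 2)
)
print(request.get_method(), request.full_url)
```

With a server running, `urllib.request.urlopen(request)` would return a JSON response whose `outputs` list holds the predictions. For real workloads, the official Triton client libraries are usually the more convenient choice.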

Once your inference server is up and running, the next step is to understand and optimise model performance. For this purpose, you can use tools like perf_analyzer, which helps you measure changes in performance as you experiment with different parameters.
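The kind of measurement involved can be illustrated with a small, hypothetical timing helper. This is not perf_analyzer and does not replace it; it is a simple stand-in that times any Python callable, for example a function that sends one inference request. The helper name `measure_latency` and its parameters are assumptions made for this sketch.

```python
import statistics
import time


def measure_latency(fn, warmup=3, runs=20):
    """Time repeated calls to fn and summarise latency in milliseconds."""
    # Warm-up calls are excluded so one-off start-up costs do not skew results.
    for _ in range(warmup):
        fn()

    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    samples_ms.sort()
    # Nearest-rank style index for the 95th percentile, clamped to the last sample.
    p95_index = min(len(samples_ms) - 1, int(0.95 * len(samples_ms)))
    return {
        "runs": runs,
        "mean_ms": statistics.mean(samples_ms),
        "p95_ms": samples_ms[p95_index],
    }


# Placeholder workload; swap in a call that performs one inference request.
stats = measure_latency(lambda: sum(range(1000)))
print(stats)
```

Reporting a tail percentile alongside the mean matters because a configuration change can improve average latency while making the slowest requests worse.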

Interactive Web App

To run the Streamlit app:

cd app/
streamlit run app.py

This starts a local server on which you can view the web application. The app contains the client side for the Triton Inference Server, along with an easy-to-use GUI.

Acknowledgement

This repository is built with references and code snippets from the NN Template by Luca Moschella.

Owner

Bharat Giddwani
