EfficientNetv2 TensorRT int8

Overview

EfficientNetv2_TensorRT_int8

EfficientNetv2模型实现来自https://github.com/d-li14/efficientnetv2.pytorch

环境配置

ubuntu:18.04

cuda:11.0

cudnn:8.0

tensorrt:7.2.16

OpenCV:3.4.2

cuda,cudnn,tensorrt和OpenCV安装包可以从如下链接下载:

链接: https://pan.baidu.com/s/1XSzHJ1kPXO0PrAMAF6uNyA 密码: b88e

cuda安装

如果系统有安装驱动,运行如下命令卸载

sudo apt-get purge nvidia*

禁用nouveau,运行如下命令

sudo vim /etc/modprobe.d/blacklist.conf

在末尾添加

blacklist nouveau

然后执行

sudo update-initramfs -u

chmod +x cuda_11.0.2_450.51.05_linux.run

sudo ./cuda_11.0.2_450.51.05_linux.run

是否接受协议: accept

然后选择Install

最后回车

vim ~/.bashrc 添加如下内容:

export PATH=/usr/local/cuda-11.0/bin:$PATH

export LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64:$LD_LIBRARY_PATH

source .bashrc 激活环境

cudnn 安装

tar -xzvf cudnn-11.0-linux-x64-v8.0.4.30.tgz

cd cuda/include

sudo cp *.h /usr/local/cuda-11.0/include

cd cuda/lib64

sudo cp libcudnn* /usr/local/cuda-11.0/lib64

tensorrt及OpenCV安装

定位到用户根目录

tar -xzvf TensorRT-7.2.1.6.Ubuntu-18.04.x86_64-gnu.cuda-11.0.cudnn8.0.tar.gz

cd TensorRT-7.2.1.6/python,该目录有4个python版本的tensorrt安装包

sudo pip3 install tensorrt-7.2.1.6-cp37-none-linux_x86_64.whl(根据自己的python版本安装)

pip install pycuda 安装python版本的cuda

定位到用户根目录

tar -xzvf opencv-3.4.2.zip 以备推理调用

efficientnetv2模型训练以及转换onnx

定位到用户根目录

git clone https://github.com/Wulingtian/EfficientNetv2_TensorRT_int8.git

cd EfficientNetv2_TensorRT_int8

vim train.py 修改IMAGENET_TRAINSET_SIZE参数 指定训练图片的数量

根据自己的训练数据及配置设置data(数据集路径),epochs,lr,batch-size等参数

python train.py,开始训练,模型保存在当前目录,名为model_best.pth.tar

vim export_onnx.py

设置weights_file(训练得到的模型),output_file(输出模型名称),img_size(图片输入大小),batch_size(推理的batch)

python export_onnx.py 得到onnx模型

onnx模型转换为 int8 tensorrt引擎

cd EfficientNetv2_TensorRT_int8/effnetv2_tensorrt_int8_tools

vim convert_trt_quant.py 修改如下参数

BATCH_SIZE 模型量化一次输入多少张图片

BATCH 模型量化次数

height width 输入图片宽和高

CALIB_IMG_DIR 量化图片路径(把训练的图片放到一个文件夹下,然后把这个文件夹设置为此参数,注意BATCH_SIZE*BATCH要小于或等于训练图片数量)

onnx_model_path onnx模型路径(上面运行export_onnx.py得到的onnx模型)

python convert_trt_quant.py 量化后的模型存到models_save目录下

TensorRT模型推理

cd EfficientNetv2_TensorRT_int8/effnetv2_tensorrt_int8

vim CMakeLists.txt

修改USER_DIR参数为自己的用户根目录

vim effnetv2_infer.cc修改如下参数

output_name effnetv2模型有1个输出

我们可以通过netron查看模型输出名

pip install netron 安装netron

vim netron_effnetv2.py 把如下内容粘贴

    import netron

    netron.start('此处填充简化后的onnx模型路径', port=3344)

python netron_effnetv2.py 即可查看 模型输出名

trt_model_path 量化的tensorrt推理引擎(models_save目录下trt后缀的文件)

test_img 测试图片路径

INPUT_W INPUT_H 输入图片宽高

NUM_CLASS 训练的模型有多少类

参数配置完毕

mkdir build

cd build

cmake ..

make

./Effnetv2sEngine 输出平均推理时间,实测1070显卡平均推理时间3.8ms一帧;至此,部署完成!

分享一下我的训练集(猫狗二分类数据)及量化数据,链接如下:

链接: https://pan.baidu.com/s/1Mh6GxTLoXRTCRQh-TPUc3Q 密码: 3dt3
Yoloxkeypointsegment - An anchor-free version of YOLO, with a simpler design but better performance

Introduction 关键点版本:已完成 全景分割版本:已完成 实例分割版本:已完成 YOLOX is an anchor-free version of

23 Oct 20, 2022
Monocular Depth Estimation - Weighted-average prediction from multiple pre-trained depth estimation models

merged_depth runs (1) AdaBins, (2) DiverseDepth, (3) MiDaS, (4) SGDepth, and (5) Monodepth2, and calculates a weighted-average per-pixel absolute dept

Pranav 39 Nov 21, 2022
A convolutional recurrent neural network for classifying A/B phases in EEG signals recorded for sleep analysis.

CAP-Classification-CRNN A deep learning model based on Inception modules paired with gated recurrent units (GRU) for the classification of CAP phases

Apurva R. Umredkar 2 Nov 25, 2022
TensorFlow 2 implementation of the Yahoo Open-NSFW model

TensorFlow 2 implementation of the Yahoo Open-NSFW model

Bosco Yung 101 Jan 01, 2023
The official implementation of paper Siamese Transformer Pyramid Networks for Real-Time UAV Tracking, accepted by WACV22

SiamTPN Introduction This is the official implementation of the SiamTPN (WACV2022). The tracker intergrates pyramid feature network and transformer in

Robotics and Intelligent Systems Control @ NYUAD 29 Jan 08, 2023
[CVPR 2022] Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels

Using Unreliable Pseudo Labels Official PyTorch implementation of Semi-Supervised Semantic Segmentation Using Unreliable Pseudo Labels, CVPR 2022. Ple

Haochen Wang 268 Dec 24, 2022
Compute descriptors for 3D point cloud registration using a multi scale sparse voxel architecture

MS-SVConv : 3D Point Cloud Registration with Multi-Scale Architecture and Self-supervised Fine-tuning Compute features for 3D point cloud registration

42 Jul 25, 2022
Code release for Local Light Field Fusion at SIGGRAPH 2019

Local Light Field Fusion Project | Video | Paper Tensorflow implementation for novel view synthesis from sparse input images. Local Light Field Fusion

1.1k Dec 27, 2022
Ludwig is a toolbox that allows to train and evaluate deep learning models without the need to write code.

Translated in 🇰🇷 Korean/ Ludwig is a toolbox that allows users to train and test deep learning models without the need to write code. It is built on

Ludwig 8.7k Dec 31, 2022
Vector Quantized Diffusion Model for Text-to-Image Synthesis

Vector Quantized Diffusion Model for Text-to-Image Synthesis Due to company policy, I have to set microsoft/VQ-Diffusion to private for now, so I prov

Shuyang Gu 294 Jan 05, 2023
MG-GCN: Scalable Multi-GPU GCN Training Framework

MG-GCN MG-GCN: multi-GPU GCN training framework. For more information, please read our paper. After cloning our repository, run git submodule update -

Translational Data Analytics (TDA) Lab @GaTech 6 Oct 24, 2022
A code repository associated with the paper A Benchmark for Rough Sketch Cleanup by Chuan Yan, David Vanderhaeghe, and Yotam Gingold from SIGGRAPH Asia 2020.

A Benchmark for Rough Sketch Cleanup This is the code repository associated with the paper A Benchmark for Rough Sketch Cleanup by Chuan Yan, David Va

33 Dec 18, 2022
Fake-user-agent-traffic-geneator - Python CLI Tool to generate fake traffic against URLs with configurable user-agents

Fake traffic generator for Gartner Demo Generate fake traffic to URLs with custo

New Relic Experimental 3 Oct 31, 2022
Reproduction process of AlexNet

PaddlePaddle论文复现杂谈 背景 注:该repo基于PaddlePaddle,对AlexNet进行复现。时间仓促,难免有所疏漏,如果问题或者想法,欢迎随时提issue一块交流。 飞桨论文复现赛地址:https://aistudio.baidu.com/aistudio/competitio

19 Nov 29, 2022
FairMOT for Multi-Class MOT using YOLOX as Detector

FairMOT-X Project Overview FairMOT-X is a multi-class multi object tracker, which has been tailored for training on the BDD100K MOT Dataset. It makes

Jonathan Tan 33 Dec 28, 2022
Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations. [2021]

Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations This repo contains the Pytorch implementation of our paper: Revisit

Wouter Van Gansbeke 80 Nov 20, 2022
[CVPR 2021] "The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models" Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Michael Carbin, Zhangyang Wang

The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models Codes for this paper The Lottery Tickets Hypo

VITA 59 Dec 28, 2022
dataset for ECCV 2020 "Motion Capture from Internet Videos"

Motion Capture from Internet Videos Motion Capture from Internet Videos Junting Dong*, Qing Shuai*, Yuanqing Zhang, Xian Liu, Xiaowei Zhou, Hujun Bao

ZJU3DV 98 Dec 07, 2022
Equivariant Imaging: Learning Beyond the Range Space

Equivariant Imaging: Learning Beyond the Range Space Equivariant Imaging: Learning Beyond the Range Space Dongdong Chen, Julián Tachella, Mike E. Davi

Dongdong Chen 46 Jan 01, 2023
Hypernetwork-Ensemble Learning of Segmentation Probability for Medical Image Segmentation with Ambiguous Labels

Hypernet-Ensemble Learning of Segmentation Probability for Medical Image Segmentation with Ambiguous Labels The implementation of Hypernet-Ensemble Le

Sungmin Hong 6 Jul 18, 2022