Object detection and instance segmentation toolkit based on PaddlePaddle.

Last update: Jan 02, 2023

Overview


PaddleDetection

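PaddleDetection 2.0 is a full upgrade! The dynamic-graph version is now the default; the static-graph version lives in the static directory.

Introduction

PaddleDetection is PaddlePaddle's object detection development toolkit. It aims to help developers complete the whole workflow of building, training, optimizing, and deploying detection models faster and better.

PaddleDetection implements many mainstream object detection algorithms in a modular way, provides a rich set of data augmentation strategies, network components (such as backbones), and loss functions, and integrates model compression and cross-platform high-performance deployment.

Refined through long-term industrial practice, PaddleDetection offers a smooth user experience and is widely used by developers in more than ten fields, including industrial quality inspection, remote-sensing image detection, unmanned inspection, new retail, the Internet, and scientific research.

Product news

2021.04.14: Released release/2.0. PaddleDetection fully supports dynamic graph, covers the static-graph model algorithms, and upgrades model performance across the board. This release also publishes the PP-YOLO v2 model and adds the rotated-box detection model S2ANet. For details, see PaddleDetection.
2021.02.07: Released release/2.0-rc, the trial version of dynamic-graph PaddleDetection. For details, see PaddleDetection dynamic graph.

Features

Rich model zoo: 100+ pretrained models for object detection, instance segmentation, face detection, and more, including several winning solutions from international competitions.
Simple to use: a modular design decouples the network components, so developers can easily build and try out detection models and optimization strategies and quickly obtain high-performance, customized algorithms.
End to end: data augmentation, network construction, training, compression, and deployment are connected end to end, with full support for multi-architecture, multi-device deployment in the cloud and at the edge.
High performance: built on PaddlePaddle's high-performance core, with clear advantages in training speed and GPU memory usage. Supports FP16 training and multi-machine training.

Toolkit structure overview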

Architectures

Two-Stage Detection: Faster RCNN, FPN, Cascade-RCNN, Libra RCNN, Hybrid Task RCNN, PSS-Det
One-Stage Detection: RetinaNet, YOLOv3, YOLOv4, PP-YOLO, SSD
Anchor Free: CornerNet-Squeeze, FCOS, TTFNet
Instance Segmentation: Mask RCNN, SOLOv2
Face Detection: FaceBoxes, BlazeFace, BlazeFace-NAS

Backbones

ResNet(&vd), ResNeXt(&vd), SENet, Res2Net, HRNet, Hourglass, CBNet, GCNet, DarkNet, CSPDarkNet, VGG, MobileNetv1/v3, GhostNet, EfficientNet

Components

Common: Sync-BN, Group Norm, DCNv2, Non-local
FPN: BiFPN, BFP, HRFPN, ACFPN
Loss: Smooth-L1, GIoU/DIoU/CIoU, IoUAware
Post-processing: SoftNMS, MatrixNMS
Speed: FP16 training, Multi-machine training

Data Augmentation

Resize, Flipping, Expand, Crop, Color Distort, Random Erasing, Mixup, Cutmix, Grid Mask, Auto Augment

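Model performance overview

The figure compares COCO accuracy (mAP) and inference speed (FPS) on a single Tesla V100 for representative models of each architecture and backbone.

Notes:

CBResNet is the Cascade-Faster-RCNN-CBResNet200vd-FPN model, which reaches 53.3% mAP on COCO.
Cascade-Faster-RCNN is Cascade-Faster-RCNN-ResNet50vd-DCN; PaddleDetection optimized it to run at 20 FPS with 47.8% mAP on COCO.
PP-YOLO reaches 45.9% mAP on COCO at 72.9 FPS on a Tesla V100, better than YOLOv4 in both accuracy and speed.
PP-YOLO v2 is a further optimization of PP-YOLO, reaching 49.5% mAP on COCO at 68.9 FPS on a Tesla V100.
All models in the figure are available in the model zoo.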
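
The FPS figures in the notes can be read as per-image latency. The following is a minimal sketch, assuming one image per forward pass (batch size 1); the helper name fps_to_ms is ours and is not part of PaddleDetection.

```python
def fps_to_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# FPS values quoted in the notes (single Tesla V100).
reported_fps = {
    "Cascade-Faster-RCNN-ResNet50vd-DCN": 20.0,
    "PP-YOLO": 72.9,
    "PP-YOLO v2": 68.9,
}

for name, fps in reported_fps.items():
    print(f"{name}: {fps_to_ms(fps):.1f} ms per image")
```

Under that assumption, the gap between PP-YOLO and PP-YOLO v2 is under one millisecond per image.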

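Documentation and tutorials

Getting started

Advanced tutorials

Model zoo

General object detection:
General instance segmentation
- SOLOv2
Rotated-box detection
- S2ANet
Vertical domains
Competition-winning solutions
- Winning model of the Objects365 2019 Challenge
- Best single model of the Open Images 2019 Object Detection competition

Application cases

Automatic Christmas-effect generator for portraits

Recommended third-party tutorials

Release notes

v2.0 was released in 04/2021. It fully supports dynamic graph, adds the BlazeFace and PSSDet model series and a large number of backbones, and publishes PP-YOLO v2, PP-YOLO tiny, and the rotated-box detection model S2ANet. It supports model distillation and VisualDL and adds a dynamic-graph inference deployment benchmark. For details, see the release notes document.

License

This project is released under the Apache 2.0 license.

Contributing

Contributions to PaddleDetection are very welcome, and we appreciate your feedback.

Citation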

@misc{ppdet2019,
title={PaddleDetection, Object detection and instance segmentation toolkit based on PaddlePaddle.},
author={PaddlePaddle Authors},
howpublished = {\url{https://github.com/PaddlePaddle/PaddleDetection}},
year={2019}
}

Comments

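🌟 PP-PicoDet has been released. You are welcome to try it and join the discussion.
PP-PicoDet is a lightweight real-time object detection model for mobile devices. We propose a series of models from small to large, including S, M, and L, that outperform existing SOTA models.

Model highlights:

🌟 High accuracy: mAP(0.5:0.95) reaches 30.6 with fewer than 1M parameters and 40.9 with 3.3M parameters.

🚀 Fast: 150 FPS on SD865.

😊 Deployment-friendly: we support PaddleInference/PaddleLite/MNN/NCNN/OpenVINO and provide C++/Python/Android demos.

Links:

For algorithm details, see the paper: https://arxiv.org/abs/2111.00902

Readme & config files: https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.3/configs/picodet

You are welcome to try it out. If you have questions, please discuss them in this thread.

Comparison with other models:

FAQ summary (continuously updated):

Version requirements: use the same Paddle version for training and for exporting the model, with PaddlePaddle >= 2.1.2.

Relationship between learning rate, number of GPUs, and batch size: follow the linear scaling rule. Almost all released configs were trained on 4 GPUs. For example, when switching to a single GPU, divide the learning rate by 4; if the batch size also changes from 80 to 40, divide the learning rate by 2 again.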
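
The linear scaling rule in this FAQ entry can be written as a short helper. This is a sketch under stated assumptions: the batch size is treated as a per-GPU value, the function name scale_learning_rate is ours, and the base learning rate of 0.4 is a placeholder rather than a value taken from a released config.

```python
def scale_learning_rate(base_lr, base_gpus, base_batch_size, gpus, batch_size):
    """Linear scaling rule: the learning rate follows the total batch size.

    base_* describe the released config; gpus and batch_size describe your
    own run. Batch sizes are assumed to be per-GPU values.
    """
    if min(base_gpus, base_batch_size, gpus, batch_size) <= 0:
        raise ValueError("GPU counts and batch sizes must be positive")
    base_total = base_gpus * base_batch_size
    new_total = gpus * batch_size
    return base_lr * new_total / base_total


# Placeholder base learning rate; read the real one from your config file.
BASE_LR = 0.4

# 4 GPUs -> 1 GPU divides the rate by 4.
print(scale_learning_rate(BASE_LR, 4, 80, 1, 80))
# Also halving the batch size (80 -> 40) divides it by 2 again.
print(scale_learning_rate(BASE_LR, 4, 80, 1, 40))
```

Computing the ratio from the total batch size keeps the two adjustments from the FAQ (GPU count and batch size) in one formula.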

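Config priority: settings in picodet_x_coco.yml generally take priority over those in __base__. Every setting in picodet_x_coco.yml overrides the corresponding setting in __base__, so editing picodet_x_coco.yml is enough.

Training on your own dataset: both the COCO and VOC data formats are supported. Transfer learning is recommended to speed up convergence. Concretely: copy the link to the COCO-trained pretrain weights from the PicoDet Readme and set the pretrain_weights parameter in the config file to those weights.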
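
The override behaviour described in the last two FAQ entries can be pictured as a recursive dictionary merge. This is an illustrative sketch only, not PaddleDetection's actual config loader; merge_config, the file paths, and the numeric values are all hypothetical.

```python
import copy


def merge_config(base: dict, override: dict) -> dict:
    """Return a new dict where values in `override` win over `base`.

    Nested dicts are merged key by key; any other value is replaced.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical values standing in for the settings inherited from __base__.
base_cfg = {
    "pretrain_weights": "path/to/imagenet_backbone.pdparams",
    "epoch": 300,
    "LearningRate": {"base_lr": 0.4},
}

# Hypothetical edits made in picodet_x_coco.yml for a custom dataset.
model_cfg = {
    "pretrain_weights": "path/to/coco_trained_picodet.pdparams",
    "LearningRate": {"base_lr": 0.05},
}

final_cfg = merge_config(base_cfg, model_cfg)
print(final_cfg["pretrain_weights"])  # the value from model_cfg
print(final_cfg["epoch"])             # untouched value from base_cfg
```

Only the keys set in the top-level file change; everything else keeps the inherited value, which is why editing picodet_x_coco.yml alone is sufficient.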

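To make communication easier, you are welcome to scan the QR code and join the WeChat group to keep discussing PP-PicoDet usage and suggestions.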

status/close
opened by yghstill 124
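C++ deployment of PaddleDetection: the instructions do not match the current latest version.

https://paddledetection.readthedocs.io/advanced_tutorials/inference/docs/windows_vs2019_build.html I followed the instructions above and got an error: CMakeCache.txt points to D:\1.6.1\paddle, and this directory does not exist. I then opened https://paddledetection.readthedocs.io/advanced_tutorials/inference/docs/windows_vs2015_build.html. That deployment version differs from my PaddleDetection and still requires version 1.6. Should I deploy version 1.6? My training files come from PaddleDetection 0.2. How should I resolve this?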

opened by wyc880622 61
Bug of quant-aware training of tinypose!
Search before asking

[X] I have searched the issues and found no similar bug report.

Describe the Bug

@yghstill "The auto-compression code for the TinyPose model is ready: https://github.com/PaddlePaddle/PaddleSlim/tree/develop/demo/auto_compression/detection The config file is configs/tinypose_qat_dis.yaml, and it is launched exactly as described in the readme. For full quantization you can use this first." That is what you said above, but several problems came up.

1. The link you gave returns 404, so I looked elsewhere, found the yml you mentioned, and ran the program.

2. I tried both of the following commands, and both fail with an error:

python3 -m paddle.distributed.launch --log_dir=log0705 --gpus 0,1 run.py --config_path=./configs/tinypose_qat_dis.yaml --save_dir='./output0705/'
python3 run.py --config_path=./configs/tinypose_qat_dis.yaml --save_dir='./output0705/'

Could someone please take a careful look?

3. Is there a version that is truly reproducible and bug-free? Could it be self-tested before release? Many thanks.

BR

Environment

paddle-gpu 2.23 or later, paddledet release/2.4, paddleslim develop, cuda 11.3, pytorch matching the cuda version

Are you willing to submit a PR?

[x] Yes I'd like to help by submitting a PR!

status/close
opened by 2050airobert 33

[BUG] PPYOLOE training problem

Training yoloe_s fails with:

ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of 
X = [1, 3024, 2] and the shape of Y = [8400, 2]. Received [3024] in X is not equal to [8400] in Y at i:1.
  [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, 
but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] 
(at /paddle/paddle/fluid/operators/elementwise/elementwise_op_function.h:240)
  [operator < elementwise_add > error]
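
One way to read the two sizes in this error, offered as a sketch rather than a confirmed diagnosis: assuming an anchor-free head that places one anchor point per feature-map cell at strides 8, 16, and 32 (an assumption; the issue does not state the strides), 8400 is the point count for a 640x640 input and 3024 is the count for 384x384. The function name count_anchor_points is ours.

```python
def count_anchor_points(height, width, strides=(8, 16, 32)):
    """Count one anchor point per cell of each feature map.

    A feature map at stride s has ceil(height / s) * ceil(width / s) cells.
    """
    total = 0
    for stride in strides:
        rows = -(-height // stride)  # ceiling division
        cols = -(-width // stride)
        total += rows * cols
    return total


print(count_anchor_points(640, 640))  # 8400
print(count_anchor_points(384, 384))  # 3024
```

If that reading holds, the anchor points were generated for one input size while the batch was resized to another, so the image size used in training and the size the head expects are worth comparing.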

opened by m00nLi 30

Training error: has no im_shape field

The config file is modified from mask_rcnn_r50_2x.yml. After running the following command:

!python tools/train.py -c configs/myconfig/mask_rcnn_r50_2x.yml --eval -o use_gpu=true --use_vdl=True --vdl_log_dir=vdl_dir/scalar

the error is:

Traceback (most recent call last):
  File "tools/train.py", line 377, in <module>
    main()
  File "tools/train.py", line 146, in main
    fetches = model.eval(feed_vars)
  File "/home/aistudio/work/PaddleDetection/ppdet/modeling/architectures/mask_rcnn.py", line 338, in eval
    return self.build(feed_vars, 'test')
  File "/home/aistudio/work/PaddleDetection/ppdet/modeling/architectures/mask_rcnn.py", line 81, in build
    self._input_check(required_fields, feed_vars)
  File "/home/aistudio/work/PaddleDetection/ppdet/modeling/architectures/mask_rcnn.py", line 271, in _input_check
    "{} has no {} field".format(feed_vars, var)
AssertionError: OrderedDict([('image', name: "image"
type {
  type: LOD_TENSOR
  lod_tensor {
    tensor {
      data_type: FP32
      dims: -1
      dims: 3
      dims: -1
      dims: -1
    }
    lod_level: 0
  }
}
persistable: false
need_check_feed: true
), ('im_info', name: "im_info"
type {
  type: LOD_TENSOR
  lod_tensor {
    tensor {
      data_type: FP32
      dims: -1
      dims: 3
    }
    lod_level: 0
  }
}
persistable: false
need_check_feed: true
), ('im_id', name: "im_id"
type {
  type: LOD_TENSOR
  lod_tensor {
    tensor {
      data_type: INT64
      dims: -1
      dims: 1
    }
    lod_level: 0
  }
}
persistable: false
need_check_feed: true
), ('gt_bbox', name: "gt_bbox"
type {
  type: LOD_TENSOR
  lod_tensor {
    tensor {
      data_type: FP32
      dims: -1
      dims: 4
    }
    lod_level: 1
  }
}
persistable: false
need_check_feed: true
), ('gt_class', name: "gt_class"
type {
  type: LOD_TENSOR
  lod_tensor {
    tensor {
      data_type: INT32
      dims: -1
      dims: 1
    }
    lod_level: 1
  }
}
persistable: false
need_check_feed: true
), ('is_crowd', name: "is_crowd"
type {
  type: LOD_TENSOR
  lod_tensor {
    tensor {
      data_type: INT32
      dims: -1
      dims: 1
    }
    lod_level: 1
  }
}
persistable: false
need_check_feed: true
), ('gt_mask', name: "gt_mask"
type {
  type: LOD_TENSOR
  lod_tensor {
    tensor {
      data_type: FP32
      dims: -1
      dims: 2
    }
    lod_level: 3
  }
}
persistable: false
need_check_feed: true
)]) has no im_shape field

I looked at line 81 of mask_rcnn.py. This is probably because model != 'train', so self._input_check(required_fields, feed_vars) checks for im_shape. Is there something wrong with my training command?
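
The assertion in the traceback can be reproduced with a few lines. This is a minimal sketch modelled on the message format shown in the traceback, not the actual body of _input_check; the list of fields required for evaluation is an assumption, since the traceback only shows that im_shape is one of them.

```python
def input_check(required_fields, feed_vars):
    """Assert that every required field is present in feed_vars."""
    for field in required_fields:
        assert field in feed_vars, "{} has no {} field".format(
            list(feed_vars), field)


# Field names taken from the OrderedDict printed in the traceback.
train_feed = {
    "image": None, "im_info": None, "im_id": None, "gt_bbox": None,
    "gt_class": None, "is_crowd": None, "gt_mask": None,
}

# Assumed requirement for evaluation; only im_shape is confirmed by the error.
eval_required = ["image", "im_info", "im_shape"]

try:
    input_check(eval_required, train_feed)
except AssertionError as err:
    print(err)
```

The feed variables printed in the error are the training fields, so the evaluation graph built by --eval is being checked against a reader definition that lacks im_shape.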

opened by iceriver97 29

Local deployment of S2ANet
Search before asking

[X] I have searched the question and found no related answer.

Please ask your question

https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.4/configs/dota Is the image at this link gone? I have a 3070 GPU with cuda11.1, paddle2.2, and paddledetection 2.2. Training by following the tutorial shows a cudnn error. Could it be that this currently supports only cuda10 and cudnn7? 30-series GPUs do not seem to work well with cuda10. Is there any way to install this model locally?
status/close
opened by jianxin123 28

"DataLoader reader thread raised an exception" when evaluating a keypoint detection model

ERROR - DataLoader reader thread raised an exception!
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/local/python3/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/local/python3/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/.local/lib/python3.6/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 662, in _thread_loop
    six.reraise(*sys.exc_info())
  File "/home/.local/lib/python3.6/site-packages/six.py", line 719, in reraise
    raise value
  File "/home/.local/lib/python3.6/site-packages/paddle/fluid/dataloader/dataloader_iter.py", line 650, in _thread_loop
    tmp.set(slot, core.CPUPlace())
ValueError: (InvalidArgument) Input object type error or incompatible array data type. tensor.set() supports array with bool, float16, float32, float64, int8, int16, int32, int64, uint8 or uint16, please check your input or input array data type. (at /paddle/paddle/fluid/pybind/tensor_py.h:355)

What data is supposed to be passed in here?

help wanted

opened by huangdf97 26

ValueError: Target 460 is out of upper bound.

Search before asking

[X] I have searched the question and found no related answer.

Please ask your question

ppyoloe_crn_s_300e_coco, VOC dataset

python tools/train.py -c configs/ppyoloe/ppyoloe_crn_s_300e_coco.yml


W0823 14:30:26.446256  4452 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.6, Runtime API Version: 11.2
W0823 14:30:26.461884  4452 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
[08/23 14:30:27] ppdet.utils.checkpoint INFO: Finish loading model weights: C:\Users\fujunnnn/.cache/paddle/weights\CSPResNetb_s_pretrained.pdparams
[08/23 14:30:30] ppdet.engine INFO: Epoch: [0] [  0/339] learning_rate: 0.000000 loss: 1931307253760.000000 loss_cls: 0.594841 loss_iou: 772522901504.000000 loss_dfl: 5885.125977 loss_l1: 0.105123 eta: 4 days, 9:30:32 batch_cost: 3.7348 data_cost: 0.2500 ips: 2.6775 images/s
Traceback (most recent call last):
  File "tools/train.py", line 177, in <module>
    main()
  File "tools/train.py", line 173, in main
    run(FLAGS, cfg)
  File "tools/train.py", line 127, in run
    trainer.train(FLAGS.eval)
  File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\engine\trainer.py", line 454, in train
    outputs = model(data)
  File "E:\anaconda3\envs\PaddleDetection\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "E:\anaconda3\envs\PaddleDetection\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\modeling\architectures\meta_arch.py", line 59, in forward
    out = self.get_loss()
  File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\modeling\architectures\yolo.py", line 125, in get_loss
    return self._forward()
  File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\modeling\architectures\yolo.py", line 88, in _forward
    yolo_losses = self.yolo_head(neck_feats, self.inputs)
  File "E:\anaconda3\envs\PaddleDetection\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "E:\anaconda3\envs\PaddleDetection\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 217, in forward
    return self.forward_train(feats, targets)
  File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 160, in forward_train
    ], targets)
  File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 355, in get_loss
    assigned_scores_sum)
  File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 291, in _bbox_loss
    assigned_ltrb_pos) * bbox_weight
  File "E:\PaddleX_GUI_2.1.0_win10\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 256, in _df_loss
    pred_dist, target_left, reduction='none') * weight_left
  File "E:\anaconda3\envs\PaddleDetection\lib\site-packages\paddle\nn\functional\loss.py", line 1723, in cross_entropy
    label_max.item()))
ValueError: Target 25479 is out of upper bound.

python tools/train.py -c configs/ppyoloe/ppyoloe_plus_crn_s_80e_coco.yml


W0823 21:31:38.730271 10200 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 11.6, Runtime API Version: 11.2
W0823 21:31:38.750262 10200 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
[08/23 21:31:40] ppdet.utils.checkpoint INFO: The shape [365] in pretrained weight yolo_head.pred_cls.0.bias is unmatched with the shape [4] in model yolo_head.pred_cls.0.bias. And the weight yolo_head.pred_cls.0.bias will not be loaded
[08/23 21:31:40] ppdet.utils.checkpoint INFO: The shape [365, 384, 3, 3] in pretrained weight yolo_head.pred_cls.0.weight is unmatched with the shape [4, 384, 3, 3] in model yolo_head.pred_cls.0.weight. And the weight yolo_head.pred_cls.0.weight will not be loaded
[08/23 21:31:40] ppdet.utils.checkpoint INFO: The shape [365] in pretrained weight yolo_head.pred_cls.1.bias is unmatched with the shape [4] in model yolo_head.pred_cls.1.bias. And the weight yolo_head.pred_cls.1.bias will not be loaded
[08/23 21:31:40] ppdet.utils.checkpoint INFO: The shape [365, 192, 3, 3] in pretrained weight yolo_head.pred_cls.1.weight is unmatched with the shape [4, 192, 3, 3] in model yolo_head.pred_cls.1.weight. And the weight yolo_head.pred_cls.1.weight will not be loaded
[08/23 21:31:40] ppdet.utils.checkpoint INFO: The shape [365] in pretrained weight yolo_head.pred_cls.2.bias is unmatched with the shape [4] in model yolo_head.pred_cls.2.bias. And the weight yolo_head.pred_cls.2.bias will not be loaded
[08/23 21:31:40] ppdet.utils.checkpoint INFO: The shape [365, 96, 3, 3] in pretrained weight yolo_head.pred_cls.2.weight is unmatched with the shape [4, 96, 3, 3] in model yolo_head.pred_cls.2.weight. And the weight yolo_head.pred_cls.2.weight will not be loaded
[08/23 21:31:40] ppdet.utils.checkpoint INFO: Finish loading model weights: C:\Users\MM/.cache/paddle/weights\ppyoloe_crn_s_obj365_pretrained.pdparams
Traceback (most recent call last):
  File "tools/train.py", line 172, in <module>
    main()
  File "tools/train.py", line 168, in main
    run(FLAGS, cfg)
  File "tools/train.py", line 132, in run
    trainer.train(FLAGS.eval)
  File "D:\0SDXX\PaddleDetection\ppdet\engine\trainer.py", line 504, in train
    outputs = model(data)
  File "D:\Anaconda3\envs\PaddleSeg\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "D:\Anaconda3\envs\PaddleSeg\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "D:\0SDXX\PaddleDetection\ppdet\modeling\architectures\meta_arch.py", line 59, in forward
    out = self.get_loss()
  File "D:\0SDXX\PaddleDetection\ppdet\modeling\architectures\yolo.py", line 124, in get_loss
    return self._forward()
  File "D:\0SDXX\PaddleDetection\ppdet\modeling\architectures\yolo.py", line 88, in _forward
    yolo_losses = self.yolo_head(neck_feats, self.inputs)
  File "D:\Anaconda3\envs\PaddleSeg\lib\site-packages\paddle\fluid\dygraph\layers.py", line 930, in __call__
    return self._dygraph_call_func(*inputs, **kwargs)
  File "D:\Anaconda3\envs\PaddleSeg\lib\site-packages\paddle\fluid\dygraph\layers.py", line 915, in _dygraph_call_func
    outputs = self.forward(*inputs, **kwargs)
  File "D:\0SDXX\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 216, in forward
    return self.forward_train(feats, targets)
  File "D:\0SDXX\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 161, in forward_train
    ], targets)
  File "D:\0SDXX\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 354, in get_loss
    assigned_scores_sum)
  File "D:\0SDXX\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 290, in _bbox_loss
    assigned_ltrb_pos) * bbox_weight
  File "D:\0SDXX\PaddleDetection\ppdet\modeling\heads\ppyoloe_head.py", line 255, in _df_loss
    pred_dist, target_left, reduction='none') * weight_left
  File "D:\Anaconda3\envs\PaddleSeg\lib\site-packages\paddle\nn\functional\loss.py", line 1723, in cross_entropy
    label_max.item()))
ValueError: Target 28 is out of upper bound.
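
Both tracebacks end in a cross_entropy call inside _df_loss, which rejects any integer target outside the range [0, number of bins). The sketch below mimics that range check in plain Python. The bin count of 17 (reg_max + 1 with reg_max = 16) is an assumption, not a value stated in the issue, and out_of_range_targets is a hypothetical helper.

```python
def out_of_range_targets(targets, num_bins):
    """Return the targets that a num_bins-way cross entropy cannot accept."""
    return [t for t in targets if not 0 <= t < num_bins]


# Assumption: the distribution head predicts reg_max + 1 = 17 bins per side.
NUM_BINS = 17

# 3 is a valid bin; 28, 460 and 25479 are the values reported in this issue.
print(out_of_range_targets([3, 28, 460, 25479], NUM_BINS))
```

Targets this large, together with the very large loss_iou in the first log, suggest the box regression targets were already far out of range before the loss was computed, so the annotations and box coordinates of the VOC dataset are a reasonable place to look first.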

windows status/close

opened by monkeycc 25

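Training ends after only one epoch

Following the tutorial, I ran python tools/train.py -c configs/yolov3/yolov3_mobilenet_v1_roadsign.yml. Training ended after only one epoch, with no error. The environment is python3.7, paddledetection is a clone of the 2.1-gpu version, and epoch is the default of 12.

opened by lizhenhanabc 25

Modifying the PPYOLOFPN structure

I made the following changes in PaddleDetection/ppdet/modeling/necks/yolo_fpn.py: [added block = ChannelAttention(block); block = SpatialAttention() in forward]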

class PPYOLOFPN(nn.Layer):
    __shared__ = ['norm_type', 'data_format']

    def __init__(self,
                 in_channels=[512, 1024, 2048],
                 norm_type='bn',
                 data_format='NCHW',
                 coord_conv=False,
                 conv_block_num=2,
                 drop_block=False,
                 block_size=3,
                 keep_prob=0.9,
                 spp=False):
        """
        PPYOLOFPN layer

        Args:
            in_channels (list): input channels for fpn
            norm_type (str): batch norm type, default bn
            data_format (str): data format, NCHW or NHWC
            coord_conv (bool): whether use CoordConv or not
            conv_block_num (int): conv block num of each pan block
            drop_block (bool): whether use DropBlock or not
            block_size (int): block size of DropBlock
            keep_prob (float): keep probability of DropBlock
            spp (bool): whether use spp or not

        """
        super(PPYOLOFPN, self).__init__()
        assert len(in_channels) > 0, "in_channels length should > 0"
        self.in_channels = in_channels
        self.num_blocks = len(in_channels)
        # parse kwargs
        self.coord_conv = coord_conv
        self.drop_block = drop_block
        self.block_size = block_size
        self.keep_prob = keep_prob
        self.spp = spp
        self.conv_block_num = conv_block_num
        self.data_format = data_format
        if self.coord_conv:
            ConvLayer = CoordConv
        else:
            ConvLayer = ConvBNLayer

        if self.drop_block:
            dropblock_cfg = [[
                'dropblock', DropBlock, [self.block_size, self.keep_prob],
                dict()
            ]]
        else:
            dropblock_cfg = []

        self._out_channels = []
        self.yolo_blocks = []
        self.routes = []
        for i, ch_in in enumerate(self.in_channels[::-1]):
            if i > 0:
                ch_in += 512 // (2**i)
            channel = 64 * (2**self.num_blocks) // (2**i)
            base_cfg = []
            c_in, c_out = ch_in, channel
            for j in range(self.conv_block_num):
                base_cfg += [
                    [
                        'conv{}'.format(2 * j), ConvLayer, [c_in, c_out, 1],
                        dict(
                            padding=0, norm_type=norm_type)
                    ],
                    [
                        'conv{}'.format(2 * j + 1), ConvBNLayer,
                        [c_out, c_out * 2, 3], dict(
                            padding=1, norm_type=norm_type)
                    ],
                ]
                c_in, c_out = c_out * 2, c_out

            base_cfg += [[
                'route', ConvLayer, [c_in, c_out, 1], dict(
                    padding=0, norm_type=norm_type)
            ], [
                'tip', ConvLayer, [c_out, c_out * 2, 3], dict(
                    padding=1, norm_type=norm_type)
            ]]

            if self.conv_block_num == 2:
                if i == 0:
                    if self.spp:
                        spp_cfg = [[
                            'spp', SPP, [channel * 4, channel, 1], dict(
                                pool_size=[5, 9, 13], norm_type=norm_type)
                        ]]
                    else:
                        spp_cfg = []
                    cfg = base_cfg[0:3] + spp_cfg + base_cfg[
                        3:4] + dropblock_cfg + base_cfg[4:6]
                else:
                    cfg = base_cfg[0:2] + dropblock_cfg + base_cfg[2:6]
            elif self.conv_block_num == 0:
                if self.spp and i == 0:
                    spp_cfg = [[
                        'spp', SPP, [c_in * 4, c_in, 1], dict(
                            pool_size=[5, 9, 13], norm_type=norm_type)
                    ]]
                else:
                    spp_cfg = []
                cfg = spp_cfg + dropblock_cfg + base_cfg
            name = 'yolo_block.{}'.format(i)
            yolo_block = self.add_sublayer(name, PPYOLODetBlock(cfg, name))
            self.yolo_blocks.append(yolo_block)
            self._out_channels.append(channel * 2)
            if i < self.num_blocks - 1:
                name = 'yolo_transition.{}'.format(i)
                route = self.add_sublayer(
                    name,
                    ConvBNLayer(
                        ch_in=channel,
                        ch_out=256 // (2**i),
                        filter_size=1,
                        stride=1,
                        padding=0,
                        norm_type=norm_type,
                        data_format=data_format,
                        name=name))
                self.routes.append(route)

    def forward(self, blocks):
        assert len(blocks) == self.num_blocks
        blocks = blocks[::-1]
        yolo_feats = []
        for i, block in enumerate(blocks):
            if i > 0:
                if self.data_format == 'NCHW':
                    logger.info("进入ChannelA")
                    block = ChannelAttention(block)
                    block = SpatialAttention()
                    block = paddle.concat([route, block], axis=1)
                else:
                    block = ChannelAttention(block)
                    block = SpatialAttention()
                    block = paddle.concat([route, block], axis=-1)
            route, tip = self.yolo_blocks[i](block)
            yolo_feats.append(tip)

            if i < self.num_blocks - 1:
                route = self.routes[i](route)
                route = F.interpolate(
                    route, scale_factor=2., data_format=self.data_format)

        return yolo_feats

    @classmethod
    def from_config(cls, cfg, input_shape):
        return {'in_channels': [i.channels for i in input_shape], }

    @property
    def out_shape(self):
        return [ShapeSpec(channels=c) for c in self._out_channels]

class ChannelAttention(nn.Layer):
    def __init__(self, in_planes, ratio=16):
        super(ChannelAttention, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2D(1)
        self.max_pool = nn.AdaptiveAvgPool2D(1)

        self.fc1   = nn.Conv2D(in_planes, in_planes // 16, 1, bias=False)
        self.relu1 = F.relu()
        self.fc2   = nn.Conv2D(in_planes // 16, in_planes, 1, bias=False)

        self.sigmoid = F.sigmoid()

    def forward(self, x):
        logger.info("进入ChannelAttention")
        
        avg_out = self.fc2(self.relu1(self.fc1(self.avg_pool(x))))
        max_out = self.fc2(self.relu1(self.fc1(self.max_pool(x))))
        out = avg_out + max_out
        return self.sigmoid(out)

class SpatialAttention(nn.Layer):
    def __init__(self, kernel_size=7):
        super(SpatialAttention, self).__init__()
        logger.info("进入SpatialAttention")

        assert kernel_size in (3, 7), 'kernel size must be 3 or 7'
        padding = 3 if kernel_size == 7 else 1

        self.conv1 = nn.Conv2D(2, 1, kernel_size, padding=padding, bias=False)
        self.sigmoid = F.sigmoid()

    def forward(self, x):
        avg_out = paddle.mean(x, dim=1, keepdim=True)
        max_out, _ = paddle.max(x, dim=1, keepdim=True)
        x = paddle.concat([avg_out, max_out], dim=1)
        x = self.conv1(x)
        return self.sigmoid(x)

enhancement

opened by zsbjmy 25

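Deploying the road-sign detection model on Jetson Nano: the GPU shows no objects, while the CPU shows objects correctly.

paddlepaddle-gpu version: 2.0.0
PaddleDetection version: release 2.0 rc
Hardware: jetson nano, jetpack4.3
Scenario:
1. Trained the road-sign detection model (yolov3_mobilenet_v1_roadsign.yml) with tools/train.py and obtained best_model.
2. Exported the model with tools/export_model.py:
python tools/export_model.py -c configs/yolov3_mobilenet_v1_roadsign.yml --output_dir=./export_model -o weights=./output/yolov3_mobilenet_v1_roadsign/best_model TestReader,input_def.image_shape=[3,320,320]
3. Deployed on the jetson nano with deploy/python/infer.py. (1) Prediction with gpu: (2) Prediction with cpu displays correctly.

Please follow up on this. Just let me know what further information I need to provide. Thanks!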
deploy

opened by GZHUZhao 24
Incorrect class description for the VisDrone-DET detection model

Document Links & Description

Link: https://github.com/PaddlePaddle/PaddleDetection/tree/release/2.5/configs/visdrone
The first paragraph has a problem: "The PaddleDetection team provides PP-YOLOE-based detection models for the VisDrone-DET small-object aerial scenario, which users can download and use. Download link for the VisDrone-DET dataset converted to COCO format; it detects 10 classes, including pedestrian(1), people(2), bicycle(3), car(4), van(5), truck(6), tricycle(7), awning-tricycle(8), bus(9), motor(10); download link for the original dataset."
Problem: in the part "detects 10 classes, including pedestrian(1), people(2), bicycle(3), car(4), van(5), truck(6), tricycle(7), awning-tricycle(8), bus(9), motor(10)", the class names and indices appear to be wrong.
Steps to reproduce:
1. Run the following command: python tools/infer.py -c configs/smalldet/ppyoloe_crn_l_80e_sliced_visdrone_640_025.yml --infer_img=demo/000000014439.jpg -o use_gpu=False weights=https://paddledet.bj.bcebos.com/models/ppyoloe_crn_l_80e_sliced_visdrone_640_025.pdparams --infer_img=demo/000000014439.jpg --draw_threshold=0.1
2. The classes printed in the file generated in the output directory include traffic_light.

Please give your suggestion

Judging from the result, something must be wrong. I have not found the correct classes yet.
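
The mismatch reported in this issue can be checked mechanically by comparing predicted label names against the documented class list. A minimal sketch: the class table is copied from the quoted documentation paragraph, while undocumented_labels and the sample predictions are hypothetical.

```python
# Class ids and names as listed in the quoted documentation paragraph.
DOCUMENTED_CLASSES = {
    1: "pedestrian", 2: "people", 3: "bicycle", 4: "car", 5: "van",
    6: "truck", 7: "tricycle", 8: "awning-tricycle", 9: "bus", 10: "motor",
}


def undocumented_labels(predicted_labels):
    """Return predicted label names that are not in the documented list."""
    known = set(DOCUMENTED_CLASSES.values())
    return sorted({label for label in predicted_labels if label not in known})


# Hypothetical labels read from an inference result; traffic_light is the
# label reported in the issue.
print(undocumented_labels(["car", "bus", "traffic_light"]))
```

Any name returned by the check means the label list used at inference time differs from the one in the documentation.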

opened by tianmaxingkong168 0
Why is it slower after switching to multiprocessing?
Search before asking

[X] I have searched the question and found no related answer.

Please ask your question

Why is it slower after switching to multiprocessing? Pedestrian detection originally took only about 6 ms, but after enabling multiprocessing as follows, it takes more than 48 ms. Why? The read_frame function reads frames and runs pedestrian detection.
opened by qq-tt 2
License plate recognition in paddledetection fails to find the plate when other text is present
Search before asking

[X] I have searched the question and found no related answer.

Please ask your question

When using license plate detection in paddledetection, if there is text on the front of the vehicle, the plate is not detected and the text is detected instead. How can this be solved?
opened by PengKunGit 2
Multiprocessing error
Search before asking

[X] I have searched the issues and found no similar bug report.

Bug Component

No response

Describe the Bug

The multiprocessing code follows an example found online and runs fine on its own, but when used with PaddlePaddle it raises the following error:

Environment

linux: PaddleDetection

Bug description confirmation

[X] I confirm that the bug replication steps, code change instructions, and environment information have been provided, and the problem can be reproduced.

Are you willing to submit a PR?

[X] I'd like to help by submitting a PR!
opened by qq-tt 2

Releases(v2.5.0)

v2.5.0(Sep 13, 2022)
2.5(08.26/2022)

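Featured models

PP-YOLOE+:

Released the PP-YOLOE+ model: accuracy on COCO test2017 improves by 0.7%-2.4% mAP, training converges 3.75x faster, and end-to-end inference is 1.73-2.3x faster

Released pretrained models for smart agriculture, night-time security detection, and industrial quality inspection, improving accuracy by 1.3%-8.1% mAP

Supports 10 high-performance training and deployment capabilities, including distributed training, online quantization, and serving deployment; adds 5+ deployment demo tutorials, including C++/Python Serving, native TRT inference, and ONNX Runtime

PP-PicoDet:

Released the PicoDet-NPU model, which supports fully quantized deployment

Added the PicoDet layout analysis model, with accuracy improved by 0.5% mAP using the FGD distillation algorithm

PP-TinyPose

Released the enhanced PP-TinyPose, improving end-to-end AP by 9.1% on business datasets for scenarios such as fitness and dance

Covers unconventional poses such as turning sideways, lying down, jumping, and high knee lifts

Added a filtering-based stabilization module that significantly improves keypoint stability

Scenario capabilities

PP-Human v2

Released PP-Human v2, supporting four industry-oriented features: a multi-solution behavior recognition case library, human attribute recognition, people-flow detection with trajectory retention, and high-accuracy cross-camera tracking

Upgraded the underlying algorithms: pedestrian detection accuracy improves by 1.5% mAP; pedestrian tracking accuracy improves by 10.2% MOTA, with the lightweight model 34% faster; attribute recognition accuracy improves by 0.6% ma, with the lightweight model 62.5% faster

Provides end-to-end tutorials covering data collection and annotation, model training and optimization, inference deployment, and modifying post-processing code in the pipeline

Added support for online video stream input

Improved usability: features run with a single line of code, and execution flow decisions and model downloads happen automatically behind the scenes.

PP-Vehicle

Newly released PP-Vehicle, supporting four core traffic-scenario features: license plate recognition, attribute recognition, traffic flow counting, and violation detection

License plate recognition supports a lightweight license plate recognition model based on PP-OCR v3

Vehicle attribute recognition supports a multi-label classification model based on PP-LCNet

Compatible with image, video, online video stream, and other input formats

Improved usability: features run with a single line of code, and execution flow decisions and model downloads happen automatically behind the scenes.

Cutting-edge algorithms

Full YOLO family model series

Released the full YOLO family model series, covering the cutting-edge detection algorithms YOLOv5, MT-YOLOv6, and YOLOv7

Based on the ConvNeXt backbone, the training schedule of each YOLO algorithm is shortened by 5-8x and accuracy generally improves by 1%-5% mAP; model compression strategies deliver a speedup of more than 30% with no loss of accuracy

Added a high-accuracy detection model based on the ViT backbone, reaching 55.7% mAP on COCO

Added the OC-SORT multi-object tracking model

Added the ConvNeXt backbone

Industrial practice example tutorials

Intelligent fitness action recognition based on the enhanced PP-TinyPose

Fight recognition based on PP-Human

Business-hall visitor analysis based on PP-Human

Vehicle structured analysis based on PP-Vehicle

PCB defect detection based on PP-YOLOE+

Framework capabilities

New features

Added auto-compression tool support with a demo: the PP-YOLOE l version loses 0.3% mAP while running 13% faster on V100

Added PaddleServing python/C++ and ONNXRuntime deployment demos

Added a PP-YOLOE end-to-end TensorRT deployment demo

Added the FGD distillation algorithm, improving RetinaNet accuracy by 3.3%

Added distributed training documentation

Improvements / bug fixes

Fixed Windows C++ deployment build issues

Fixed saving results when predicting on VOC-format data

Fixed detection box output in FairMOT C++ deployment

The rotated-box detection model S2ANet supports deployment with batch size > 1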

v2.4.0(Apr 24, 2022)
2.4(03.24/2022)

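PP-YOLOE:

Released the featured PP-YOLOE model: the l version reaches 51.6% accuracy on COCO test2017 at 78.1 FPS on V100, server-side SOTA in both accuracy and speed

Released the s/m/l/x model series with TensorRT and ONNX deployment support

Supports mixed-precision training, with training 33% faster than PP-YOLOv2

PP-PicoDet:

Released the optimized PP-PicoDet models: accuracy improves by about 2% and CPU inference is 63% faster.

Added the PicoDet-XS model with 0.7M parameters

Integrated post-processing into the network, reducing end-to-end deployment cost

Pedestrian analysis pipeline:

Released the PP-Human pedestrian analysis pipeline, covering pedestrian detection, attribute recognition, pedestrian tracking, cross-camera tracking, people-flow counting, and action recognition, with TensorRT deployment support

Attribute recognition supports the StrongBaseline model

ReID supports the Centroid model

Action recognition supports ST-GCN fall detection

Model coverage:

Released YOLOX, supporting the nano/tiny/s/m/l/x versions; the x version reaches 51.8% accuracy on COCO val2017

Framework optimizations:

EMA training is 20% faster; improved how EMA-trained models are saved

Supports saving infer results in COCO format

Deployment optimizations:

All RCNN-series models support exporting ONNX models via Paddle2ONNX

SSD models support fusing the decoding OP at export time, improving edge deployment speed

Supports exporting NMS to TensorRT, improving end-to-end TensorRT deployment speed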

v2.3.0(Dec 9, 2021)
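Detection: the lightweight mobile detection model PP-PicoDet, with mobile SOTA accuracy and speed

Keypoints: the lightweight mobile keypoint model PP-TinyPose

Model coverage:

Detection:

Added the Swin-Transformer object detection model

Added the TOOD (Task-aligned One-stage Object Detection) model

Added the GFL (Generalized Focal Loss) object detection model

Released the Sniper small-object detection optimization method, supporting Faster RCNN and the PP-YOLO model series

Released the PP-YOLO-EB model optimized for EdgeBoard

Tracking

Released the real-time tracking system PP-Tracking

Released high-accuracy, small-scale, and lightweight FairMot models

Released a vertical model zoo for real-time pedestrian, head, and vehicle tracking, covering aerial surveillance, autonomous driving, dense crowds, tiny objects, and other scenarios

Adapted the DeepSORT model to more detectors such as PP-YOLO and PP-PicoDet

Keypoints

Added the Lite HRNet model

Inference deployment:

YOLOv3-series models support NPU inference deployment

Enabled C++ inference deployment for the FairMot model

Enabled C++ and Paddle Lite inference deployment for the keypoint model series

Documentation:

Added English documentation for each model series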

v2.2.0(Aug 17, 2021)
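Model coverage:

Released Transformer detection models: DETR, Deformable DETR, Sparse RCNN

Added the Dark model for keypoint detection and released the Dark HRNet model

Released the HRNet keypoint detection model for the MPII dataset

Released vertical models for head and vehicle tracking

Model optimization:

The rotated-box detection model S2ANet released an Align Conv optimized model, raising mAP on DOTA to 74.0

Inference deployment

Mainstream models support inference deployment with batch size > 1, including YOLOv3, PP-YOLO, Faster RCNN, SSD, TTFNet, and FCOS

Added Python inference deployment support for multi-object tracking models (JDE, FairMot, DeepSort), including TensorRT inference

Added Python inference deployment support for the multi-object tracking model FairMot combined with keypoint detection models

Added inference deployment support for keypoint detection models combined with PP-YOLO

Documentation:

Added TensorRT version notes to the Windows inference deployment documentation

Updated and published the FAQ document

Bug fixes:

Fixed training convergence issues in the PP-YOLO model series

Fixed training on unlabeled data when batch size > 1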

v2.1.0(May 20, 2021)
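Model coverage:

Released the keypoint models HRNet and HigherHRNet

Released the multi-object tracking models DeepSort, FairMot, and JDE

Core framework capabilities:

Supports training on images that have no annotated boxes

Inference deployment:

Paddle Inference supports inference with batch size > 1 for YOLOv3-series models

Enabled inference deployment for the rotated-box detection model S2ANet

Added a quantized model benchmark

Added Paddle-Lite demos for dynamic-graph and static-graph models

Detection model compression:

Released compressed models for the PPYOLO series

Documentation:

Updated the quick start, inference deployment, and other tutorials

Added an ONNX model export tutorial

Added mobile deployment documentation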

v2.0.0(Apr 19, 2021)
2.0(04.15/2021)

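Note: starting with version 2.0, dynamic graph is the default version of PaddleDetection. The former dygraph directory becomes the root directory, and the original static-graph implementation moves to the static directory.

Dynamic-graph model coverage:

Released the PP-YOLOv2 and PP-YOLO tiny models; PP-YOLOv2 reaches 49.5% accuracy on the COCO test set at 68.9 FPS on V100

Released the rotated-box detection model S2ANet

Released the practical two-stage model PSS-Det

Released the face detection model Blazeface

New basic modules:

Added the SENet, GhostNet, and Res2Net backbones

Added VisualDL training visualization support

Added per-class accuracy computation and PR curve plotting

YOLO-series models support the NHWC data format

Inference deployment:

Released inference benchmark data for the main models

Adapted to TensorRT6, with support for TensorRT dynamic-shape input and TensorRT int8 quantized inference

Enabled python/cpp/TRT inference deployment on Linux, Windows, and NV Jetson for 7 model families, including PP-YOLO, YOLOv3, SSD, TTFNet, FCOS, and Faster RCNN

Detection model compression:

Distillation: added dynamic-graph distillation support and released a distilled YOLOv3-MobileNetV1 model

Combined strategy: added a dynamic-graph pruning + distillation compression scheme and released a pruned + distilled YOLOv3-MobileNetV1 model

Bug fix: fixed exporting dynamic-graph quantized models

Documentation:

Added English dynamic-graph documentation, including the home page, getting started, quick start, model algorithms, adding new datasets, and more

Added dynamic-graph installation documentation in Chinese and English

Added config file templates and config option documentation for the dynamic-graph RCNN and YOLO series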

v0.1.0(Nov 27, 2019)
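Based on PaddlePaddle v1.6.1.

Models include: Faster R-CNN, Mask R-CNN, Faster R-CNN+FPN, Mask R-CNN+FPN, Cascade-Faster-RCNN+FPN, Cascade-Mask-RCNN+FPN, RetinaNet, YOLOv3, SSD, and the face detection models Faceboxes and BlazeFace.

The enhanced YOLOv3 reaches 41.4% accuracy on COCO, and the CBResNet200-vd-FPN-Nonlocal model reaches 53.3% on COCO; pretrained models for pedestrian detection and vehicle detection are included

Supports sync-bn, multi-scale training, multi-scale testing, and FP16 training, and includes an inference benchmark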
