I tried to apply the CAM algorithm to YOLOv4 and it worked.

Overview

YOLOV4:You Only Look Once目标检测模型在pytorch当中的实现


2021年2月7日更新:
加入letterbox_image的选项,关闭letterbox_image后网络的map得到大幅度提升。

目录

  1. 性能情况 Performance
  2. 实现的内容 Achievement
  3. 所需环境 Environment
  4. 注意事项 Attention
  5. 小技巧的设置 TricksSet
  6. 文件下载 Download
  7. 预测步骤 How2predict
  8. 训练步骤 How2train
  9. 参考资料 Reference

性能情况

训练数据集 权值文件名称 测试数据集 输入图片大小 mAP 0.5:0.95 mAP 0.5
VOC07+12+COCO yolo4_voc_weights.pth VOC-Test07 416x416 - 89.0
COCO-Train2017 yolo4_weights.pth COCO-Val2017 416x416 46.1 70.2

实现的内容

  • 主干特征提取网络:DarkNet53 => CSPDarkNet53
  • 特征金字塔:SPP,PAN
  • 训练用到的小技巧:Mosaic数据增强、Label Smoothing平滑、CIOU、学习率余弦退火衰减
  • 激活函数:使用Mish激活函数
  • ……balabla

所需环境

torch==1.2.0

注意事项

代码中的yolo4_weights.pth是基于608x608的图片训练的,但是由于显存原因。我将代码中的图片大小修改成了416x416。有需要的可以修改回来。 代码中的默认anchors是基于608x608的图片的。
注意不要使用中文标签,文件夹中不要有空格!
在训练前需要务必在model_data下新建一个txt文档,文档中输入需要分的类,在train.py中将classes_path指向该文件

小技巧的设置

在train.py文件下:
1、mosaic参数可用于控制是否实现Mosaic数据增强。
2、Cosine_scheduler可用于控制是否使用学习率余弦退火衰减。
3、label_smoothing可用于控制是否Label Smoothing平滑。

文件下载

训练所需的yolo4_weights.pth可在百度网盘中下载。
链接: https://pan.baidu.com/s/1WlDNPtGO1pwQbqwKx1gRZA 提取码: p4sc
yolo4_weights.pth是coco数据集的权重。
yolo4_voc_weights.pth是voc数据集的权重。

预测步骤

a、使用预训练权重

  1. 下载完库后解压,在百度网盘下载yolo4_weights.pth或者yolo4_voc_weights.pth,放入model_data,运行predict.py,输入
img/street.jpg
  1. 利用video.py可进行摄像头检测。

b、使用自己训练的权重

  1. 按照训练步骤训练。
  2. 在yolo.py文件里面,在如下部分修改model_path和classes_path使其对应训练好的文件;model_path对应logs文件夹下面的权值文件,classes_path是model_path对应分的类
_defaults = {
    "model_path": 'model_data/yolo4_weights.pth',
    "anchors_path": 'model_data/yolo_anchors.txt',
    "classes_path": 'model_data/coco_classes.txt',
    "model_image_size" : (416, 416, 3),
    "confidence": 0.5,
    "cuda": True
}
  1. 运行predict.py,输入
img/street.jpg
  1. 利用video.py可进行摄像头检测。

训练步骤

  1. 本文使用VOC格式进行训练。
  2. 训练前将标签文件放在VOCdevkit文件夹下的VOC2007文件夹下的Annotation中。
  3. 训练前将图片文件放在VOCdevkit文件夹下的VOC2007文件夹下的JPEGImages中。
  4. 在训练前利用voc2yolo4.py文件生成对应的txt。
  5. 再运行根目录下的voc_annotation.py,运行前需要将classes改成你自己的classes。注意不要使用中文标签,文件夹中不要有空格!
classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
  1. 此时会生成对应的2007_train.txt,每一行对应其图片位置及其真实框的位置
  2. 在训练前需要务必在model_data下新建一个txt文档,文档中输入需要分的类,在train.py中将classes_path指向该文件,示例如下:
classes_path = 'model_data/new_classes.txt'    

model_data/new_classes.txt文件内容为:

cat
dog
...
  1. 运行train.py即可开始训练。

mAP目标检测精度计算更新

更新了get_gt_txt.py、get_dr_txt.py和get_map.py文件。
get_map文件克隆自https://github.com/Cartucho/mAP
具体mAP计算过程可参考:https://www.bilibili.com/video/BV1zE411u7Vw

Reference

https://github.com/qqwweee/keras-yolo3/
https://github.com/Cartucho/mAP
https://github.com/Ma-Dan/keras-yolo4

The above is original readme.md

My work

I tried to train the YOLOv4 to detect the helmet and it's color. In order to know whether it learned well, I visualized the output of the YOLO-Head.

origin.jpg detection.jpg
origin.jpg detection.jpg
head0 head1 head2
head0layer0score.jpg head1layer0score.jpg head2layer0score.jpg
head0layer1class.jpg head1layer1class.jpg head2layer1class.jpg
head0layer2class_score.jpg head1layer2class_score.jpg head2layer2class_score.jpg

The above are shown the visualization of the "yellow", we can easily see the hot area focus on the yellow helmets. So I think this can help us to train the model to a certain extent.

The source code of the ICCV2021 paper "PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering"

The source code of the ICCV2021 paper "PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering"

Ren Yurui 261 Jan 09, 2023
An self sufficient AI that crawls the web to learn how to generate art from keywords

Roxx-IO - The Smart Artist AI! TO DO / IDEAS Implement Web-Scraping Functionality Figure out a less annoying (and an off button for it) text to speech

Tatz 5 Mar 21, 2022
Code release for the paper “Worldsheet Wrapping the World in a 3D Sheet for View Synthesis from a Single Image”, ICCV 2021.

Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image This repository contains the code for the following paper: R. Hu,

Meta Research 37 Jan 04, 2023
A collection of semantic image segmentation models implemented in TensorFlow

A collection of semantic image segmentation models implemented in TensorFlow. Contains data-loaders for the generic and medical benchmark datasets.

bobby 16 Dec 06, 2019
Denoising Diffusion Implicit Models

Denoising Diffusion Implicit Models (DDIM) Jiaming Song, Chenlin Meng and Stefano Ermon, Stanford Implements sampling from an implicit model that is t

465 Jan 05, 2023
Meta graph convolutional neural network-assisted resilient swarm communications

Resilient UAV Swarm Communications with Graph Convolutional Neural Network This repository contains the source codes of Resilient UAV Swarm Communicat

62 Dec 06, 2022
ML-Decoder: Scalable and Versatile Classification Head

ML-Decoder: Scalable and Versatile Classification Head Paper Official PyTorch Implementation Tal Ridnik, Gilad Sharir, Avi Ben-Cohen, Emanuel Ben-Baru

189 Jan 04, 2023
Code accompanying the NeurIPS 2021 paper "Generating High-Quality Explanations for Navigation in Partially-Revealed Environments"

Generating High-Quality Explanations for Navigation in Partially-Revealed Environments This work presents an approach to explainable navigation under

RAIL Group @ George Mason University 1 Oct 28, 2022
2nd solution of ICDAR 2021 Competition on Scientific Literature Parsing, Task B.

TableMASTER-mmocr Contents About The Project Method Description Dependency Getting Started Prerequisites Installation Usage Data preprocess Train Infe

Jianquan Ye 298 Dec 21, 2022
Ultra-Data-Efficient GAN Training: Drawing A Lottery Ticket First, Then Training It Toughly

Ultra-Data-Efficient GAN Training: Drawing A Lottery Ticket First, Then Training It Toughly Code for this paper Ultra-Data-Efficient GAN Tra

VITA 77 Oct 05, 2022
Official implementations of PSENet, PAN and PAN++.

News (2021/11/03) Paddle implementation of PAN, see Paddle-PANet. Thanks @simplify23. (2021/04/08) PSENet and PAN are included in MMOCR. Introduction

395 Dec 14, 2022
Implementation of ToeplitzLDA for spatiotemporal stationary time series data.

Code for the ToeplitzLDA classifier proposed in here. The classifier conforms sklearn and can be used as a drop-in replacement for other LDA classifiers. For in-depth usage refer to the learning from

Jan Sosulski 5 Nov 07, 2022
Diverse Image Generation via Self-Conditioned GANs

Diverse Image Generation via Self-Conditioned GANs Project | Paper Diverse Image Generation via Self-Conditioned GANs Steven Liu, Tongzhou Wang, David

Steven Liu 147 Dec 03, 2022
This is the official implementation of the paper "Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation".

ObjProp Introduction This is the official implementation of the paper "Object Propagation via Inter-Frame Attentions for Temporally Stable Video Insta

Anirudh S Chakravarthy 6 May 03, 2022
A small tool to joint picture including gif

README 做设计的时候遇到拼接长图的情况,但是发现没有什么好用的能拼接gif的工具。 于是自己写了个gif拼接小工具。 可以自动拼接gif、png和jpg等常见格式。 效果 从上至下 从下至上 从左至右 从右至左 使用 克隆仓库 git clone https://github.com/Dels

3 Dec 15, 2021
A comprehensive and up-to-date developer education platform for Urbit.

curriculum A comprehensive and up-to-date developer education platform for Urbit. This project organizes developer capabilities into a hierarchy of co

Sigilante 36 Oct 04, 2022
UNAVOIDS: Unsupervised and Nonparametric Approach for Visualizing Outliers and Invariant Detection Scoring

UNAVOIDS: Unsupervised and Nonparametric Approach for Visualizing Outliers and Invariant Detection Scoring Code Summary aggregate.py: this script aggr

1 Dec 28, 2021
HAR-stacked-residual-bidir-LSTMs - Deep stacked residual bidirectional LSTMs for HAR

HAR-stacked-residual-bidir-LSTM The project is based on this repository which is presented as a tutorial. It consists of Human Activity Recognition (H

Guillaume Chevalier 287 Dec 27, 2022
Source code for our paper "Improving Empathetic Response Generation by Recognizing Emotion Cause in Conversations"

Source code for our paper "Improving Empathetic Response Generation by Recognizing Emotion Cause in Conversations" this repository is maintained by bo

Yuhan Liu 24 Nov 29, 2022
Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI

Hourglass Transformer - Pytorch (wip) Implementation of Hourglass Transformer, in Pytorch. It will also contain some of my own ideas about how to make

Phil Wang 61 Dec 25, 2022