I tried to apply the CAM algorithm to YOLOv4 and it worked.

Overview

YOLOV4:You Only Look Once目标检测模型在pytorch当中的实现


2021年2月7日更新:
加入letterbox_image的选项,关闭letterbox_image后网络的map得到大幅度提升。

目录

  1. 性能情况 Performance
  2. 实现的内容 Achievement
  3. 所需环境 Environment
  4. 注意事项 Attention
  5. 小技巧的设置 TricksSet
  6. 文件下载 Download
  7. 预测步骤 How2predict
  8. 训练步骤 How2train
  9. 参考资料 Reference

性能情况

训练数据集 权值文件名称 测试数据集 输入图片大小 mAP 0.5:0.95 mAP 0.5
VOC07+12+COCO yolo4_voc_weights.pth VOC-Test07 416x416 - 89.0
COCO-Train2017 yolo4_weights.pth COCO-Val2017 416x416 46.1 70.2

实现的内容

  • 主干特征提取网络:DarkNet53 => CSPDarkNet53
  • 特征金字塔:SPP,PAN
  • 训练用到的小技巧:Mosaic数据增强、Label Smoothing平滑、CIOU、学习率余弦退火衰减
  • 激活函数:使用Mish激活函数
  • ……balabla

所需环境

torch==1.2.0

注意事项

代码中的yolo4_weights.pth是基于608x608的图片训练的,但是由于显存原因。我将代码中的图片大小修改成了416x416。有需要的可以修改回来。 代码中的默认anchors是基于608x608的图片的。
注意不要使用中文标签,文件夹中不要有空格!
在训练前需要务必在model_data下新建一个txt文档,文档中输入需要分的类,在train.py中将classes_path指向该文件

小技巧的设置

在train.py文件下:
1、mosaic参数可用于控制是否实现Mosaic数据增强。
2、Cosine_scheduler可用于控制是否使用学习率余弦退火衰减。
3、label_smoothing可用于控制是否Label Smoothing平滑。

文件下载

训练所需的yolo4_weights.pth可在百度网盘中下载。
链接: https://pan.baidu.com/s/1WlDNPtGO1pwQbqwKx1gRZA 提取码: p4sc
yolo4_weights.pth是coco数据集的权重。
yolo4_voc_weights.pth是voc数据集的权重。

预测步骤

a、使用预训练权重

  1. 下载完库后解压,在百度网盘下载yolo4_weights.pth或者yolo4_voc_weights.pth,放入model_data,运行predict.py,输入
img/street.jpg
  1. 利用video.py可进行摄像头检测。

b、使用自己训练的权重

  1. 按照训练步骤训练。
  2. 在yolo.py文件里面,在如下部分修改model_path和classes_path使其对应训练好的文件;model_path对应logs文件夹下面的权值文件,classes_path是model_path对应分的类
_defaults = {
    "model_path": 'model_data/yolo4_weights.pth',
    "anchors_path": 'model_data/yolo_anchors.txt',
    "classes_path": 'model_data/coco_classes.txt',
    "model_image_size" : (416, 416, 3),
    "confidence": 0.5,
    "cuda": True
}
  1. 运行predict.py,输入
img/street.jpg
  1. 利用video.py可进行摄像头检测。

训练步骤

  1. 本文使用VOC格式进行训练。
  2. 训练前将标签文件放在VOCdevkit文件夹下的VOC2007文件夹下的Annotation中。
  3. 训练前将图片文件放在VOCdevkit文件夹下的VOC2007文件夹下的JPEGImages中。
  4. 在训练前利用voc2yolo4.py文件生成对应的txt。
  5. 再运行根目录下的voc_annotation.py,运行前需要将classes改成你自己的classes。注意不要使用中文标签,文件夹中不要有空格!
classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
  1. 此时会生成对应的2007_train.txt,每一行对应其图片位置及其真实框的位置
  2. 在训练前需要务必在model_data下新建一个txt文档,文档中输入需要分的类,在train.py中将classes_path指向该文件,示例如下:
classes_path = 'model_data/new_classes.txt'    

model_data/new_classes.txt文件内容为:

cat
dog
...
  1. 运行train.py即可开始训练。

mAP目标检测精度计算更新

更新了get_gt_txt.py、get_dr_txt.py和get_map.py文件。
get_map文件克隆自https://github.com/Cartucho/mAP
具体mAP计算过程可参考:https://www.bilibili.com/video/BV1zE411u7Vw

Reference

https://github.com/qqwweee/keras-yolo3/
https://github.com/Cartucho/mAP
https://github.com/Ma-Dan/keras-yolo4

The above is original readme.md

My work

I tried to train the YOLOv4 to detect the helmet and it's color. In order to know whether it learned well, I visualized the output of the YOLO-Head.

origin.jpg detection.jpg
origin.jpg detection.jpg
head0 head1 head2
head0layer0score.jpg head1layer0score.jpg head2layer0score.jpg
head0layer1class.jpg head1layer1class.jpg head2layer1class.jpg
head0layer2class_score.jpg head1layer2class_score.jpg head2layer2class_score.jpg

The above are shown the visualization of the "yellow", we can easily see the hot area focus on the yellow helmets. So I think this can help us to train the model to a certain extent.

HiFT: Hierarchical Feature Transformer for Aerial Tracking (ICCV2021)

HiFT: Hierarchical Feature Transformer for Aerial Tracking Ziang Cao, Changhong Fu, Junjie Ye, Bowen Li, and Yiming Li Our paper is Accepted by ICCV 2

Intelligent Vision for Robotics in Complex Environment 55 Nov 23, 2022
Official Pytorch Implementation of 'Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization' (ICCV-21 Oral)

Learning-Action-Completeness-from-Points Official Pytorch Implementation of 'Learning Action Completeness from Points for Weakly-supervised Temporal A

Pilhyeon Lee 67 Jan 03, 2023
Implementation of Self-supervised Graph-level Representation Learning with Local and Global Structure (ICML 2021).

Self-supervised Graph-level Representation Learning with Local and Global Structure Introduction This project is an implementation of ``Self-supervise

MilaGraph 50 Dec 09, 2022
This project is a re-implementation of MASTER: Multi-Aspect Non-local Network for Scene Text Recognition by MMOCR

This project is a re-implementation of MASTER: Multi-Aspect Non-local Network for Scene Text Recognition by MMOCR,which is an open-source toolbox based on PyTorch. The overall architecture will be sh

Jianquan Ye 82 Nov 17, 2022
A new benchmark for Icon Question Answering (IconQA) and a large-scale icon dataset Icon645.

IconQA About IconQA is a new diverse abstract visual question answering dataset that highlights the importance of abstract diagram understanding and c

Pan Lu 24 Dec 30, 2022
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Pyserini Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations. Retrieval using sparse re

Castorini 706 Dec 29, 2022
PyTorch implementation of SIFT descriptor

This is an differentiable pytorch implementation of SIFT patch descriptor. It is very slow for describing one patch, but quite fast for batch. It can

Dmytro Mishkin 150 Dec 24, 2022
PyTorch implementation of PSPNet segmentation network

pspnet-pytorch PyTorch implementation of PSPNet segmentation network Original paper Pyramid Scene Parsing Network Details This is a slightly different

Roman Trusov 532 Dec 29, 2022
High frequency AI based algorithmic trading module.

Flow Flow is a high frequency algorithmic trading module that uses machine learning to self regulate and self optimize for maximum return. The current

59 Dec 14, 2022
Analyzing basic network responses to novel classes

novelty-detection Analyzing how AlexNet responds to novel classes with varying degrees of similarity to pretrained classes from ImageNet. If you find

Noam Eshed 34 Oct 02, 2022
Code release for Universal Domain Adaptation(CVPR 2019)

Universal Domain Adaptation Code release for Universal Domain Adaptation(CVPR 2019) Requirements python 3.6+ PyTorch 1.0 pip install -r requirements.t

THUML @ Tsinghua University 229 Dec 23, 2022
某学校选课系统GIF验证码数据集 + Baseline模型 + 上下游相关工具

elective-dataset-2021spring 某学校2021春季选课系统GIF验证码数据集(29338张) + 准确率98.4%的Baseline模型 + 上下游相关工具。 数据集采用 知识共享署名-非商业性使用 4.0 国际许可协议 进行许可。 Baseline模型和上下游相关工具采用

xmcp 27 Sep 17, 2021
Seach Losses of our paper 'Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search', accepted by ICLR 2021.

CSE-Autoloss Designing proper loss functions for vision tasks has been a long-standing research direction to advance the capability of existing models

Peidong Liu(刘沛东) 54 Dec 17, 2022
Active and Sample-Efficient Model Evaluation

Active Testing: Sample-Efficient Model Evaluation Hi, good to see you here! 👋 This is code for "Active Testing: Sample-Efficient Model Evaluation". P

Jannik Kossen 19 Oct 30, 2022
Official codes for the paper "Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech"

ResDAVEnet-VQ Official PyTorch implementation of Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech What is in this repo? M

Wei-Ning Hsu 21 Aug 23, 2022
Gym environments used in the paper: "Developmental Reinforcement Learning of Control Policy of a Quadcopter UAV with Thrust Vectoring Rotors"

gym_multirotor Gym to train reinforcement learning agents on UAV platforms Quadrotor Tiltrotor Requirements This package has been tested on Ubuntu 18.

Aditya M. Deshpande 19 Dec 29, 2022
Repository for the Bias Benchmark for QA dataset.

BBQ Repository for the Bias Benchmark for QA dataset. Authors: Alicia Parrish, Angelica Chen, Nikita Nangia, Vishakh Padmakumar, Jason Phang, Jana Tho

ML² AT CILVR 18 Nov 18, 2022
You Only Look Once for Panopitic Driving Perception

You Only 👀 Once for Panoptic 🚗 Perception You Only Look at Once for Panoptic driving Perception by Dong Wu, Manwen Liao, Weitian Zhang, Xinggang Wan

Hust Visual Learning Team 1.4k Jan 04, 2023
Simple (but Strong) Baselines for POMDPs

Recurrent Model-Free RL is a Strong Baseline for Many POMDPs Welcome to the POMDP world! This repo provides some simple baselines for POMDPs, specific

Tianwei V. Ni 172 Dec 29, 2022
Tracking Progress in Question Answering over Knowledge Graphs

Tracking Progress in Question Answering over Knowledge Graphs Table of contents Question Answering Systems with Descriptions The QA Systems Table cont

Knowledge Graph Question Answering 47 Jan 02, 2023