A low-power, ultra-lightweight, general-purpose object detection algorithm based on YOLO. The model has only 250k parameters and can reach roughly 300+ fps on a smartphone.

Last update: Dec 26, 2022

Overview

⚡ Yolo-FastestV2 ⚡

Simple, fast, compact, and easy to port
Low resource usage, excellent single-core performance, and lower power consumption
Faster and smaller: trades a 1% loss in accuracy for a 40% increase in inference speed, while reducing the parameter count by 25%
Fast training with low compute requirements: training needs only 3 GB of video memory, and one COCO epoch takes only 7 minutes on a GTX 1660 Ti

Evaluation metrics / Benchmark

| Network | COCO mAP(0.5) | Resolution | Run Time (4x Core) | Run Time (1x Core) | FLOPs (G) | Params (M) |
|---|---|---|---|---|---|---|
| Yolo-FastestV2 | 23.56% | 352x352 | 3.23 ms | 4.5 ms | 0.238 | 0.25 |
| Yolo-FastestV1.1 | 24.40% | 320x320 | 5.59 ms | 7.52 ms | 0.252 | 0.35 |
| Yolov4-Tiny | 40.2% | 416x416 | 23.67 ms | 40.14 ms | 6.9 | 5.77 |

Test platform: Mi 11 (Snapdragon 888 CPU), based on NCNN
Reason for the increase in inference speed: optimized model memory access
Suitable for hardware with extremely tight computing resources
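
As a quick sanity check on the numbers, the latencies in the benchmark table above can be converted to frames per second and compared with Yolo-FastestV1.1. This is only arithmetic on the published figures; it assumes the run times cover model inference alone, and the helper names `fps` and `latency_reduction` are made up for this sketch.

```python
# Latencies in milliseconds, copied from the benchmark table above.
latency_ms = {
    "Yolo-FastestV2": {"4xCore": 3.23, "1xCore": 4.5},
    "Yolo-FastestV1.1": {"4xCore": 5.59, "1xCore": 7.52},
}


def fps(ms):
    """Frames per second implied by a per-frame latency in milliseconds."""
    return 1000.0 / ms


def latency_reduction(old_ms, new_ms):
    """Fraction by which the latency dropped going from old_ms to new_ms."""
    return (old_ms - new_ms) / old_ms


v2 = latency_ms["Yolo-FastestV2"]
v1 = latency_ms["Yolo-FastestV1.1"]
for cores in ("4xCore", "1xCore"):
    print(
        f"{cores}: {fps(v2[cores]):.0f} fps, "
        f"{latency_reduction(v1[cores], v2[cores]):.0%} lower latency than V1.1"
    )
```

With the table values this prints about 310 fps (4 cores) and 222 fps (1 core), with latency roughly 42% and 40% lower than V1.1.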

How to use

Install dependencies

pip3 install -r requirements.txt

Test

Image test

python3 test.py --data data/coco.data --weights modelzoo/coco2017-epoch-0.235624ap-model.pth --img img/dog.jpg

How to train

Building the dataset (constructed the same way as for Darknet YOLO)

The dataset format is the same as Darknet YOLO's: each image corresponds to a .txt label file. The label format also follows Darknet YOLO's dataset label format: "category cx cy w h", where category is the class index, cx and cy are the normalized coordinates of the bounding box center, and w and h are the normalized width and height of the bounding box. An example of a .txt label file's contents:
```
11 0.344192634561 0.611 0.416430594901 0.262
14 0.509915014164 0.51 0.974504249292 0.972
```
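
If your annotations are stored as pixel-coordinate corner boxes, a label line in the format described above can be produced along these lines. This is a minimal sketch, not part of the repository: the function names `to_darknet_label` and `parse_darknet_label` and the six-decimal formatting are assumptions.

```python
def to_darknet_label(category, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-coordinate box to a 'category cx cy w h' label line."""
    cx = (x_min + x_max) / 2.0 / img_w  # normalized center x
    cy = (y_min + y_max) / 2.0 / img_h  # normalized center y
    w = (x_max - x_min) / img_w         # normalized width
    h = (y_max - y_min) / img_h         # normalized height
    return f"{category} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


def parse_darknet_label(line):
    """Parse one label line and check that the box values are normalized."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    category = int(fields[0])
    cx, cy, w, h = (float(v) for v in fields[1:])
    if not all(0.0 <= v <= 1.0 for v in (cx, cy, w, h)):
        raise ValueError(f"values must lie in [0, 1]: {line!r}")
    return category, cx, cy, w, h


line = to_darknet_label(11, 100, 200, 300, 400, img_w=640, img_h=480)
print(line)  # 11 0.312500 0.625000 0.312500 0.416667
print(parse_darknet_label(line))
```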

The image and its corresponding label file have the same name and are stored in the same directory. The data file structure is as follows:

.
├── train
│   ├── 000001.jpg
│   ├── 000001.txt
│   ├── 000002.jpg
│   ├── 000002.txt
│   ├── 000003.jpg
│   └── 000003.txt
└── val
    ├── 000043.jpg
    ├── 000043.txt
    ├── 000057.jpg
    ├── 000057.txt
    ├── 000070.jpg
    └── 000070.txt

Generate the dataset path .txt files. Example contents:

train.txt

/home/qiuqiu/Desktop/dataset/train/000001.jpg
/home/qiuqiu/Desktop/dataset/train/000002.jpg
/home/qiuqiu/Desktop/dataset/train/000003.jpg

val.txt

/home/qiuqiu/Desktop/dataset/val/000070.jpg
/home/qiuqiu/Desktop/dataset/val/000043.jpg
/home/qiuqiu/Desktop/dataset/val/000057.jpg
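
The path files can be written by hand or generated. The following is a small sketch of one way to do it, assuming the train/val directory layout shown earlier; `write_path_list` and the set of accepted image suffixes are hypothetical and not taken from the repository.

```python
from pathlib import Path

# Assumption: which file suffixes count as images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def write_path_list(image_dir, list_file):
    """Write the absolute path of every image in image_dir, one per line."""
    image_dir = Path(image_dir).resolve()
    images = sorted(
        p for p in image_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )
    Path(list_file).write_text("".join(f"{p}\n" for p in images))
    return len(images)
```

For example, `write_path_list("train", "train.txt")` followed by `write_path_list("val", "val.txt")`, run from the dataset root, would produce the two files.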

Generate the .names category label file. Example contents:

category.names
```
person
bicycle
car
motorbike
...
```

The final directory structure of the training dataset is as follows:

.
├── category.names        # .names category label file
├── train                 # train dataset
│   ├── 000001.jpg
│   ├── 000001.txt
│   ├── 000002.jpg
│   ├── 000002.txt
│   ├── 000003.jpg
│   └── 000003.txt
├── train.txt              # train dataset path .txt file
├── val                    # val dataset
│   ├── 000043.jpg
│   ├── 000043.txt
│   ├── 000057.jpg
│   ├── 000057.txt
│   ├── 000070.jpg
│   └── 000070.txt
└── val.txt                # val dataset path .txt file
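
Before training, it can be worth confirming that every image listed in a path file has a label file of the same name next to it, as the layout above requires. A minimal sketch, with `find_missing_labels` as a hypothetical helper that is not part of the repository:

```python
from pathlib import Path


def find_missing_labels(list_file):
    """Return the listed image paths that have no same-named .txt label file."""
    missing = []
    for line in Path(list_file).read_text().splitlines():
        image_path = line.strip()
        if not image_path:
            continue  # skip blank lines in the path file
        label_path = Path(image_path).with_suffix(".txt")
        if not label_path.is_file():
            missing.append(image_path)
    return missing
```

Calling `find_missing_labels("train.txt")` and `find_missing_labels("val.txt")` should return empty lists for a correctly built dataset.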

Get anchor bias

Generate anchors based on the current dataset

python3 genanchors.py --traintxt ./train.txt

The anchors6.txt file will be generated in the current directory. Sample contents of anchors6.txt:

12.64,19.39, 37.88,51.48, 55.71,138.31, 126.91,78.23, 131.57,214.55, 279.92,258.87  # anchor bias
0.636158                                                                             # iou
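
The anchor line is a flat list of width,height pairs. The sketch below parses it and checks the pair count. `parse_anchors` and `check_anchor_count` are hypothetical helpers, and the default of two detection scales is an assumption inferred from the six sample anchors together with anchor_num=3 in the sample configuration, not something stated by the project.

```python
def parse_anchors(text):
    """Parse 'w,h, w,h, ...' (optionally followed by a # comment) into pairs."""
    values = [float(v) for v in text.split("#")[0].split(",") if v.strip()]
    if len(values) % 2:
        raise ValueError("anchor values must come in width,height pairs")
    return list(zip(values[0::2], values[1::2]))


def check_anchor_count(anchors, anchor_num, num_scales=2):
    """Assumption: the model expects anchor_num anchors per detection scale."""
    return len(anchors) == anchor_num * num_scales


sample = ("12.64,19.39, 37.88,51.48, 55.71,138.31, "
          "126.91,78.23, 131.57,214.55, 279.92,258.87  # anchor bias")
anchors = parse_anchors(sample)
print(len(anchors), anchors[0], anchors[-1])  # 6 (12.64, 19.39) (279.92, 258.87)
print(check_anchor_count(anchors, anchor_num=3))
```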

Build the training .data configuration file

Refer to ./data/coco.data

[name]
model_name=coco           # model name

[train-configure]
epochs=300                # number of training epochs
steps=150,250             # learning rate decay steps
batch_size=64             # batch size
subdivisions=1            # Same as the subdivisions of the darknet cfg file
learning_rate=0.001       # learning rate

[model-configure]
pre_weights=None          # path of the weights to load; if None, training starts from scratch
classes=80                # Number of detection categories
width=352                 # The width of the model input image
height=352                # The height of the model input image
anchor_num=3              # anchor num
anchors=12.64,19.39, 37.88,51.48, 55.71,138.31, 126.91,78.23, 131.57,214.55, 279.92,258.87 #anchor bias

[data-configure]
train=/media/qiuqiu/D/coco/train2017.txt   # train dataset path .txt file
val=/media/qiuqiu/D/coco/val2017.txt       # val dataset path .txt file 
names=./data/coco.names                    # .names category label file
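
The .data file uses an INI-like layout with inline # comments, so it can be inspected with Python's standard configparser. The following is a sketch for checking a configuration before training; it is not the loader the repository itself uses, and `load_data_config` is a hypothetical name. The embedded sample is shortened from the configuration above.

```python
import configparser

SAMPLE = """\
[name]
model_name=coco           # model name

[train-configure]
epochs=300                # number of training epochs
steps=150,250             # learning rate decay steps
batch_size=64             # batch size
learning_rate=0.001       # learning rate

[model-configure]
pre_weights=None          # weights to load
classes=80                # number of detection categories
width=352
height=352
anchor_num=3
anchors=12.64,19.39, 37.88,51.48, 55.71,138.31, 126.91,78.23, 131.57,214.55, 279.92,258.87 #anchor bias
"""


def load_data_config(text):
    """Parse the .data text into a dict of typed values."""
    # Inline comments are only recognized when whitespace precedes the '#'.
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(text)
    train = parser["train-configure"]
    model = parser["model-configure"]
    pre_weights = model.get("pre_weights")
    return {
        "epochs": train.getint("epochs"),
        "steps": [int(s) for s in train.get("steps").split(",")],
        "learning_rate": train.getfloat("learning_rate"),
        "pre_weights": None if pre_weights == "None" else pre_weights,
        "classes": model.getint("classes"),
        "anchor_num": model.getint("anchor_num"),
        "anchors": [float(a) for a in model.get("anchors").split(",")],
    }


cfg = load_data_config(SAMPLE)
print(cfg["classes"], cfg["steps"], len(cfg["anchors"]))  # 80 [150, 250] 12
```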

Train

Run the training task
```
python3 train.py --data data/coco.data
```

Evaluation

Compute the mAP

python3 evaluation.py --data data/coco.data --weights modelzoo/coco2017-epoch-0.235624ap-model.pth

Deploy

NCNN

Comments

Low precision and recall

Hello

I'm training with only one class from the COCO dataset. The data file is the standard one; I only changed the anchors and set classes to 1.

[name]
model_name=coco

[train-configure]
epochs=300
steps=150,250
batch_size=128
subdivisions=1
learning_rate=0.001

[model-configure]
pre_weights=model/backbone/backbone.pth
classes=1
width=352
height=352
anchor_num=3
anchors=8.54,20.34, 25.67,59.99, 52.42,138.38, 103.52,235.28, 197.43,103.53, 238.02,287.40

[data-configure]
train=coco_person/train.txt
val=coco_person/val.txt
names=data/coco.names

I get an AP of 0.41, but with a low precision of 0.53 and a recall of 0.41, so the model's predictions contain many false positives.

Why am I getting such low precision and recall?

P.S. I checked the bbox annotations and they are correct.

Thanks!

opened by natxopedreira 1

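Test example: cannot find the generated image file

I downloaded the source code and ran the following command: python3 test.py --data data/coco.data --weights modelzoo/coco2017-0.241078ap-model.pth --img img/000139.jpg

However, I could not find test_result.png. Could you advise what the cause might be? Thanks a lot.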

opened by lixiangMindSpore 1

Anchor Number

I reduced the anchor number from 3 to 2, and a problem occurs during training (evaluation):

anchor_boxes[:, :, :, :2] = ((r[:, :, :, :2].sigmoid() * 2. - 0.5) + grid) * stride

RuntimeError: The size of tensor a (2) must match the size of tensor b (3) at non-singleton dimension 3

The model configuration is:

[model-configure]
pre_weights=None
classes=7
width=320
height=320
anchor_num=2
anchors=10.54,9.51, 45.60,40.45, 119.62,95.06, 253.71,138.37

opened by Yuanye-F 1

onnx2ncnn error Gather not supported yet!

(base) ~/Yolo-FastestV2$ python pytorch2onnx.py --data ./data/coco.data --weights modelzoo/coco2017-epoch-0.235624ap-model.pth
load param...
/home/pc/Yolo-FastestV2/model/backbone/shufflenetv2.py:59: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert (num_channels % 4 == 0)

./onnx2ncnn model.onnx fast.param fast.bin
Gather not supported yet!

axis=0

Gather not supported yet!

axis=0

Gather not supported yet!

axis=0

Gather not supported yet!

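opened by wavelet2008 1

Inference results differ between the exported ONNX model and the .pth model

After generating a new ONNX model with the included ONNX conversion script, I tested with both the .pth and the ONNX models and found that the inference results differ. With onnxruntime, the ONNX outputs have shapes (1,22,22,16) and (1,11,11,16), while the .pth outputs are (1,12,22,22), (1,3,22,22), (1,1,22,22), (1,12,11,11), (1,3,11,11), (1,1,11,11). Even after post-processing, the final results still differ from those of the .pth file. Could anyone offer some guidance?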

opened by ifdealer 0

An error occurred during training; the message is as follows

Traceback (most recent call last):
  File "train.py", line 139, in <module>
    _, _, AP, _ = utils.utils.evaluation(val_dataloader, cfg, model, device)
  File "D:\competition\Yolo-FastestV2-main\utils\utils.py", line 367, in evaluation
    for imgs, targets in pbar:
  File "C:\anaconda\envs\fire\lib\site-packages\tqdm\std.py", line 1195, in __iter__
    for obj in iterable:
  File "C:\anaconda\envs\fire\lib\site-packages\torch\utils\data\dataloader.py", line 521, in __next__
    data = self._next_data()
  File "C:\anaconda\envs\fire\lib\site-packages\torch\utils\data\dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "C:\anaconda\envs\fire\lib\site-packages\torch\utils\data\dataloader.py", line 1229, in _process_data
    data.reraise()
  File "C:\anaconda\envs\fire\lib\site-packages\torch\_utils.py", line 434, in reraise
    raise exception
Exception: Caught Exception in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\anaconda\envs\fire\lib\site-packages\torch\utils\data\_utils\worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\anaconda\envs\fire\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\anaconda\envs\fire\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "D:\competition\Yolo-FastestV2-main\utils\datasets.py", line 127, in __getitem__
    raise Exception("%s is not exist" % label_path)
Exception: .txt is not exist

opened by richardlotw 4

Releases(V0.2)

V0.2(Aug 11, 2021)

V0.1(Aug 11, 2021)


Owner

qiuqiuqiuqiu ...球

