This repo is customized for VisDrone.

Last update: Jul 17, 2022

Overview

Object Detection for VisDrone (object detection in drone aerial images)

My environment

1. Windows 10 (Linux also works)
2. tensorflow >= 1.12.0
3. python 3.6 (anaconda)
4. cv2
5. ensemble-boxes (pip install ensemble-boxes)

Datasets(XML format for training set)

(1) The dataset is available at https://github.com/VisDrone/VisDrone-Dataset
(2) Download the XML annotations from Baidu Yun (extraction code: ia3f) or Google Drive, and configure their path in ./core/config/cfgs.py
(3) Alternatively, use ./data/visdrone2xml.py to generate the VisDrone XML files yourself; modify the path information in the script first.

training-set format:

├── VisDrone2019-DET-train
│     ├── Annotation(xml format)
│     ├── JPEGImages
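
The annotation conversion mentioned above can be sketched roughly as follows. This is a minimal illustration, not the contents of ./data/visdrone2xml.py: the function name visdrone_txt_to_voc_xml, the choice to skip categories 0 (ignored regions) and 11 (others), and the exact Pascal VOC style XML layout are assumptions. The input line layout (bbox_left, bbox_top, bbox_width, bbox_height, score, category, truncation, occlusion) follows the VisDrone-DET annotation format.

```python
import xml.etree.ElementTree as ET

# Category ids from the VisDrone-DET annotation format.
# Assumption of this sketch: 0 ("ignored regions") and 11 ("others") are skipped.
VISDRONE_CLASSES = {
    1: "pedestrian", 2: "people", 3: "bicycle", 4: "car", 5: "van",
    6: "truck", 7: "tricycle", 8: "awning-tricycle", 9: "bus", 10: "motor",
}


def visdrone_txt_to_voc_xml(txt_lines, filename, width, height):
    """Build a Pascal VOC style XML string from VisDrone annotation lines.

    Each line is expected to look like:
    bbox_left,bbox_top,bbox_width,bbox_height,score,category,truncation,occlusion
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"

    for line in txt_lines:
        # Drop empty fields so a trailing comma does not break parsing.
        fields = [f for f in line.strip().split(",") if f != ""]
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        left, top, box_w, box_h = (int(v) for v in fields[:4])
        category = int(fields[5])
        if category not in VISDRONE_CLASSES:
            continue
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = VISDRONE_CLASSES[category]
        ET.SubElement(obj, "difficult").text = "0"
        box = ET.SubElement(obj, "bndbox")
        # VisDrone stores left/top/width/height; VOC stores corner coordinates.
        ET.SubElement(box, "xmin").text = str(left)
        ET.SubElement(box, "ymin").text = str(top)
        ET.SubElement(box, "xmax").text = str(left + box_w)
        ET.SubElement(box, "ymax").text = str(top + box_h)
    return ET.tostring(root, encoding="unicode")
```

The returned string can be written to a file under the Annotation folder, one XML file per image, with the image width and height read from the corresponding JPEG.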

Pretrained Models(ResNet50vd, 101vd)

Download the pretrained models from Baidu Yun (extraction code: krce) or Google Drive, then put them into ./data/pretrained_weights

Train

Modify the parameters in ./core/config/cfgs.py
python train_step.py

Eval

Modify the parameters in ./core/config/cfgs.py
python eval_visdrone.py writes the detections to txt files; use the official MATLAB toolkit to evaluate the final results.
python eval_model_ensemble.py evaluates the model ensemble. Before running it, set NORMALIZED_RESULTS_FOR_MODEL_ENSEMBLE=True in cfgs.py and run eval_visdrone.py to produce the normalized txt results.
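
The normalized results are presumably needed because weighted_boxes_fusion from the ensemble-boxes package expects box coordinates scaled to the [0, 1] range. The sketch below shows that idea under stated assumptions: the helper names normalize_boxes, denormalize_boxes and fuse_two_models are hypothetical, boxes are assumed to be [xmin, ymin, xmax, ymax] in pixels, and equal model weights are used. It is not the code in eval_model_ensemble.py.

```python
def normalize_boxes(boxes, img_w, img_h):
    """Scale [xmin, ymin, xmax, ymax] pixel boxes into the [0, 1] range."""
    return [
        [x1 / img_w, y1 / img_h, x2 / img_w, y2 / img_h]
        for x1, y1, x2, y2 in boxes
    ]


def denormalize_boxes(boxes, img_w, img_h):
    """Scale [0, 1] boxes back to pixel coordinates."""
    return [
        [x1 * img_w, y1 * img_h, x2 * img_w, y2 * img_h]
        for x1, y1, x2, y2 in boxes
    ]


def fuse_two_models(boxes_a, scores_a, labels_a,
                    boxes_b, scores_b, labels_b,
                    img_w, img_h, iou_thr=0.55):
    """Fuse the detections of two models for one image (hypothetical helper)."""
    # Imported lazily so the helpers above work without the package installed.
    from ensemble_boxes import weighted_boxes_fusion

    boxes, scores, labels = weighted_boxes_fusion(
        [normalize_boxes(boxes_a, img_w, img_h),
         normalize_boxes(boxes_b, img_w, img_h)],
        [scores_a, scores_b],
        [labels_a, labels_b],
        weights=[1, 1],      # assumption: both models weighted equally
        iou_thr=iou_thr,
        skip_box_thr=0.0,
    )
    # The library returns NumPy arrays; convert them to plain lists.
    return (denormalize_boxes(boxes.tolist(), img_w, img_h),
            scores.tolist(), labels.tolist())
```

Here boxes_a and boxes_b would be the per-image detections of the two backbones (for example ResNet50-vd and ResNet101-vd), and the fused boxes are scaled back to pixels before being written out for evaluation.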

Visualization

Modify the parameters in ./core/config/cfgs.py
python image_demo.py produces the visualized results.

Visualized Result (multi-scale training+multi-scale testing)

Test Result (Validation set):

1. ResNet50-vd

Name	maxDets	Result(s/m)
Average Precision (AP) @( IoU=0.50:0.95)	maxDets=500	31.26%/35.1%
Average Precision (AP) @( IoU=0.50 )	maxDets=500	56.44%/60.29%
Average Precision (AP) @( IoU=0.75 )	maxDets=500	30.13%/35.42%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets= 1	0.78%/0.58%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets= 10	6.62%/6.05%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets=100	38.21%/40.99%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets=500	48.41%/53%

"s" means single-scale training + single-scale testing; "m" means multi-scale training + multi-scale testing.

2. ResNet101-vd

Name	maxDets	Result(s/m)
Average Precision (AP) @( IoU=0.50:0.95)	maxDets=500	31.7%/35.98%
Average Precision (AP) @( IoU=0.50 )	maxDets=500	56.94%/61.64%
Average Precision (AP) @( IoU=0.75 )	maxDets=500	30.59%/36.13%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets= 1	0.67%/0.61%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets= 10	6.29%/6.13%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets=100	38.66%/42.33%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets=500	49.29%/53.68%

3. Model Ensemble (ResNet101-vd+ResNet50-vd)

Name	maxDets	Result
Average Precision (AP) @( IoU=0.50:0.95)	maxDets=500	36.76%
Average Precision (AP) @( IoU=0.50 )	maxDets=500	62.33%
Average Precision (AP) @( IoU=0.75 )	maxDets=500	37.41%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets= 1	0.59%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets= 10	6.06%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets=100	42.57%
Average Recall (AR) @( IoU=0.50:0.95)	maxDets=500	54.53%
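
For readers unfamiliar with the metric names in the tables above: "IoU=0.50:0.95" means the score is averaged over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05, and maxDets caps the number of detections considered per image. The following is a small sketch of the IoU computation and the threshold list only. The names iou, IOU_THRESHOLDS and thresholds_passed are hypothetical, and the official MATLAB toolkit remains the reference for the reported numbers.

```python
def iou(box_a, box_b):
    """Intersection over union of two [xmin, ymin, xmax, ymax] boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds behind "IoU=0.50:0.95": 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, gt_box):
    """Return the thresholds at which pred_box counts as a match for gt_box."""
    overlap = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if overlap >= t]
```

A prediction that overlaps its ground-truth box with IoU 0.72, for instance, counts as a match at thresholds 0.50 through 0.70 and as a miss at 0.75 and above, which is why AP at IoU=0.75 is lower than AP at IoU=0.50 in every table.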

You can download the trained weights (ResNet50vd, 101vd) from Baidu Yun (extraction code: 9u9m) or Google Drive, then put them into ./saved_weights

Reference

1. https://github.com/DetectionTeamUCAS/Faster-RCNN_Tensorflow
2. https://github.com/open-mmlab/mmdetection
3. https://github.com/ZFTurbo/Weighted-Boxes-Fusion
4. https://github.com/kobiso/CBAM-tensorflow-slim
5. https://github.com/SJTU-Thinklab-Det/DOTA-DOAI
6. https://github.com/Viredery/tf-eager-fasterrcnn
7. https://github.com/VisDrone/VisDrone2018-DET-toolkit
8. https://github.com/YunYang1994/tensorflow-yolov3
9. https://github.com/zhpmatrix/VisDrone2018
