A PyTorch-based R-YOLOv4 implementation which combines YOLOv4 model and loss function from R3Det for arbitrary oriented object detection.

Overview

R-YOLOv4

This is a PyTorch-based R-YOLOv4 implementation which combines YOLOv4 model and loss function from R3Det for arbitrary oriented object detection. (Final project for NCKU INTRODUCTION TO ARTIFICIAL INTELLIGENCE course)

Introduction

The objective of this project is to adapt YOLOv4 model to detecting oriented objects. As a result, modifying the original loss function of the model is required. I got a successful result by increasing the number of anchor boxes with different rotating angle and combining smooth-L1-IoU loss function proposed by R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object into the original loss for bounding boxes.

Features


Loss Function (only for x, y, w, h, theta)

loss

angle


Scheduler

Cosine Annealing with Warmup (Reference: Cosine Annealing with Warmup for PyTorch)
scheduler


Recall

recall

As the paper suggested, I get a better results from **f(ariou) = exp(1-ariou)-1**. Therefore I used it for my loss function.

Usage

  1. Clone and Setup Environment

    $ git clone https://github.com/kunnnnethan/R-YOLOv4.git
    $ cd R-YOLOv4/
    

    Create Conda Environment

    $ conda env create -f environment.yml
    

    Create Python Virtual Environment

    $ python3.8 -m venv (your environment name)
    $ source ~/your-environment-name/bin/activate
    $ pip3 install torch torchvision torchaudio
    $ pip install -r requirements.txt
    
  2. Download pretrained weights
    weights

  3. Make sure your files arrangment looks like the following
    Note that each of your dataset folder in data should split into three files, namely train, test, and detect.

    R-YOLOv4/
    ├── train.py
    ├── test.py
    ├── detect.py
    ├── xml2txt.py
    ├── environment.xml
    ├── requirements.txt
    ├── model/
    ├── datasets/
    ├── lib/
    ├── outputs/
    ├── weights/
        ├── pretrained/ (for training)
        └── UCAS-AOD/ (for testing and detection)
    └── data/
        └── UCAS-AOD/
            ├── class.names
            ├── train/
                ├── ...png
                └── ...txt
            ├── test/
                ├── ...png
                └── ...txt
            └── detect/
                └── ...png
    
  4. Train, Test, and Detect
    Please refer to lib/options.py to check out all the arguments.

Train

I have implemented methods to load and train three different datasets. They are UCAS-AOD, DOTA, and custom dataset respectively. You can check out how I loaded those dataset into the model at /datasets. The angle of each bounding box is limited in (- pi/2, pi/2], and the height of each bounding box is always longer than it's width.

You can run experiments/display_inputs.py to visualize whether your data is loaded successfully.

UCAS-AOD dataset

Please refer to this repository to rearrange files so that it can be loaded and trained by this model.
You can download the weight that I trained from UCAS-AOD.

While training, please specify which dataset you are using.
$ python train.py --dataset UCAS_AOD

DOTA dataset

Download the official dataset from here. The original files should be able to be loaded and trained by this model.

While training, please specify which dataset you are using.
$ python train.py --dataset DOTA

Train with custom dataset

  1. Use labelImg2 to help label your data. labelImg2 is capable of labeling rotated objects.
  2. Move your data folder into the R-YOLOv4/data folder.
  3. Run xml2txt.py
    1. generate txt files: python xml2txt.py --data_folder your-path --action gen_txt
    2. delete xml files: python xml2txt.py --data_folder your-path --action del_xml

A trash custom dataset that I made and the weight trained from it are provided for your convenience.

While training, please specify which dataset you are using.
$ python train.py --dataset custom

Training Log

---- [Epoch 2/2] ----
+---------------+--------------------+---------------------+---------------------+----------------------+
| Step: 596/600 | loss               | reg_loss            | conf_loss           | cls_loss             |
+---------------+--------------------+---------------------+---------------------+----------------------+
| YoloLayer1    | 0.4302629232406616 | 0.32991039752960205 | 0.09135108441114426 | 0.009001442231237888 |
| YoloLayer2    | 0.7385762333869934 | 0.5682911276817322  | 0.15651139616966248 | 0.013773750513792038 |
| YoloLayer3    | 1.5002599954605103 | 1.1116538047790527  | 0.36262497305870056 | 0.025981156155467033 |
+---------------+--------------------+---------------------+---------------------+----------------------+
Total Loss: 2.669099, Runtime: 404.888372

Tensorboard

If you would like to use tensorboard for tracking traing process.

  • Open additional terminal in the same folder where you are running program.
  • Run command $ tensorboard --logdir='weights/your_model_name/logs' --port=6006
  • Go to http://localhost:6006/

Results

UCAS_AOD

Method Plane Car mAP
YOLOv4 (smoothL1-iou) 98.05 92.05 95.05

car

plane

DOTA

DOTA have not been tested yet. (It's quite difficult to test because of large resolution of images) DOTADOTA

trash (custom dataset)

Method Plane Car mAP
YOLOv4 (smoothL1-iou) 100.00 100.00 100.00

garbage1

garbage2

TODO

  • Mosaic Augmentation
  • Mixup Augmentation

References

yangxue0827/RotationDetection
eriklindernoren/PyTorch-YOLOv3
Tianxiaomo/pytorch-YOLOv4
ultralytics/yolov5

YOLOv4: Optimal Speed and Accuracy of Object Detection

Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao

Abstract There are a huge number of features which are said to improve Convolutional Neural Network (CNN) accuracy. Practical testing of combinations of such features on large datasets, and theoretical justification of the result, is required. Some features operate on certain models exclusively and for certain problems exclusively, or only for small-scale datasets; while some features, such as batch-normalization and residual-connections, are applicable to the majority of models, tasks, and datasets...

@article{yolov4,
  title={YOLOv4: Optimal Speed and Accuracy of Object Detection},
  author={Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao},
  journal = {arXiv},
  year={2020}
}

R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object

Xue Yang, Junchi Yan, Ziming Feng, Tao He

Abstract Rotation detection is a challenging task due to the difficulties of locating the multi-angle objects and separating them effectively from the background. Though considerable progress has been made, for practical settings, there still exist challenges for rotating objects with large aspect ratio, dense distribution and category extremely imbalance. In this paper, we propose an end-to-end refined single-stage rotation detector for fast and accurate object detection by using a progressive regression approach from coarse to fine granularity...

@article{r3det,
  title={R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object},
  author={Xue Yang, Junchi Yan, Ziming Feng, Tao He},
  journal = {arXiv},
  year={2019}
}
Continual reinforcement learning baselines: experiment specifications, implementation of existing methods, and common metrics. Easily extensible to new methods.

Continual Reinforcement Learning This repository provides a simple way to run continual reinforcement learning experiments in PyTorch, including evalu

55 Dec 24, 2022
Spiking Neural Network for Computer Vision using SpikingJelly framework and Pytorch-Lightning

Spiking Neural Network for Computer Vision using SpikingJelly framework and Pytorch-Lightning

Sami BARCHID 2 Oct 20, 2022
Code repository for Semantic Terrain Classification for Off-Road Autonomous Driving

BEVNet Datasets Datasets should be put inside data/. For example, data/semantic_kitti_4class_100x100. Training BEVNet-S Example: cd experiments bash t

(Brian) JoonHo Lee 24 Dec 12, 2022
Revealing and Protecting Labels in Distributed Training

Revealing and Protecting Labels in Distributed Training

Google Interns 0 Nov 09, 2022
Tensorflow 2 implementation of our high quality frame interpolation neural network

FILM: Frame Interpolation for Large Scene Motion Project | Paper | YouTube | Benchmark Scores Tensorflow 2 implementation of our high quality frame in

Google Research 1.6k Dec 28, 2022
Cold Brew: Distilling Graph Node Representations with Incomplete or Missing Neighborhoods

Cold Brew: Distilling Graph Node Representations with Incomplete or Missing Neighborhoods Introduction Graph Neural Networks (GNNs) have demonstrated

37 Dec 15, 2022
🚗 INGI Dakar 2K21 - Be the first one on the finish line ! 🚗

🚗 INGI Dakar 2K21 - Be the first one on the finish line ! 🚗 This year's first semester Club Info challenge will put you at the head of a car racing

ClubINFO INGI (UCLouvain) 6 Dec 10, 2021
Header-only library for using Keras models in C++.

frugally-deep Use Keras models in C++ with ease Table of contents Introduction Usage Performance Requirements and Installation FAQ Introduction Would

Tobias Hermann 927 Jan 05, 2023
🌎 The Modern Declarative Data Flow Framework for the AI Empowered Generation.

🌎 JSONClasses JSONClasses is a declarative data flow pipeline and data graph framework. Official Website: https://www.jsonclasses.com Official Docume

Fillmula Inc. 53 Dec 09, 2022
An experiment to bait a generalized frontrunning MEV bot

Honeypot 🍯 A simple experiment that: Creates a honeypot contract Baits a generalized fronturnning bot with a unique transaction Analyze bot behaviour

0x1355 14 Nov 24, 2022
🐦 Quickly annotate data from the comfort of your Jupyter notebook

🐦 pigeon - Quickly annotate data on Jupyter Pigeon is a simple widget that lets you quickly annotate a dataset of unlabeled examples from the comfort

Anastasis Germanidis 647 Jan 05, 2023
This is a package for LiDARTag, described in paper: LiDARTag: A Real-Time Fiducial Tag System for Point Clouds

LiDARTag Overview This is a package for LiDARTag, described in paper: LiDARTag: A Real-Time Fiducial Tag System for Point Clouds (PDF)(arXiv). This wo

University of Michigan Dynamic Legged Locomotion Robotics Lab 159 Dec 21, 2022
Official implementation for NIPS'17 paper: PredRNN: Recurrent Neural Networks for Predictive Learning Using Spatiotemporal LSTMs.

PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning The predictive learning of spatiotemporal sequences aims to generate future

THUML: Machine Learning Group @ THSS 243 Dec 26, 2022
WatermarkRemoval-WDNet-WACV2021

WatermarkRemoval-WDNet-WACV2021 Thank you for your attention. Citation Please cite the related works in your publications if it helps your research: @

LUYI 63 Dec 05, 2022
Supervised 3D Pre-training on Large-scale 2D Natural Image Datasets for 3D Medical Image Analysis

Introduction This is an implementation of our paper Supervised 3D Pre-training on Large-scale 2D Natural Image Datasets for 3D Medical Image Analysis.

24 Dec 06, 2022
Machine learning framework for both deep learning and traditional algorithms

NeoML is an end-to-end machine learning framework that allows you to build, train, and deploy ML models. This framework is used by ABBYY engineers for

NeoML 704 Dec 27, 2022
Continuous Time LiDAR odometry

CT-ICP: Elastic SLAM for LiDAR sensors This repository implements the SLAM CT-ICP (see our article), a lightweight, precise and versatile pure LiDAR o

385 Dec 29, 2022
Hierarchical Aggregation for 3D Instance Segmentation (ICCV 2021)

HAIS Hierarchical Aggregation for 3D Instance Segmentation (ICCV 2021) by Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang*. (*) Corresp

Hust Visual Learning Team 145 Jan 05, 2023
Stochastic Tensor Optimization for Robot Motion - A GPU Robot Motion Toolkit

STORM Stochastic Tensor Optimization for Robot Motion - A GPU Robot Motion Toolkit [Install Instructions] [Paper] [Website] This package contains code

NVIDIA Research Projects 101 Dec 12, 2022
AAI supports interdisciplinary research to help better understand human, animal, and artificial cognition.

AnimalAI 3 AAI supports interdisciplinary research to help better understand human, animal, and artificial cognition. It aims to support AI research t

Matthew Crosby 58 Dec 12, 2022