FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection

Overview

FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection

FCOSR: A Simple Anchor-free Rotated Detector for Aerial Object Detection
arXiv preprint (arXiv:2111.10780).

This implement is modified from mmdetection. We also refer to the codes of ReDet, PIoU, and ProbIoU.

In the process of implementation, we find that only Python code processing will produce huge memory overhead on Nvidia devices. Therefore, we directly write the label assignment module proposed in this paper in the form of CUDA extension of Pytorch. The program could not work effectively when we migrate it to cuda 11 (only support cuda10). By applying CUDA expansion, the memory utilization is improved and a lot of unnecessary calculations are reduced. We also try to train FCOSR-M on 2080ti (4 images per device), which can basically fill memory of graphics card.

FCOSR TensorRT inference code is available at: https://github.com/lzh420202/TensorRT_Inference

We add a multiprocess version DOTA2COCO into DOTA_devkit package, you could switch USE_MULTI_PROCESS to control the function in prepare_dota.py

Install

Please refer to install.md for installation and dataset preparation.

Getting Started

Please see get_started.md for the basic usage.

Model Zoo

Speed vs Accuracy on DOTA 1.0 test set

benchmark

Details (Test device: nvidia RTX 2080ti)

Methods backbone FPS mAP(%)
ReDet ReR50 8.8 76.25
S2ANet Mobilenet v2 18.9 67.46
S2ANet R50 14.4 74.14
R3Det R50 9.2 71.9
Oriented-RCNN Mobilenet v2 21.2 72.72
Oriented-RCNN R50 13.8 75.87
Oriented-RCNN R101 11.3 76.28
RetinaNet-O Mobilenet v2 22.4 67.95
RetinaNet-O R50 16.5 72.7
RetinaNet-O R101 13.3 73.7
Faster-RCNN-O Mobilenet v2 23 67.41
Faster-RCNN-O R50 14.4 72.29
Faster-RCNN-O R101 11.4 72.65
FCOSR-S Mobilenet v2 23.7 74.05
FCOSR-M Rx50 14.6 77.15
FCOSR-L Rx101 7.9 77.39

The password of baiduPan is ABCD

FCOSR serise DOTA 1.0 result.FPS(2080ti) Detail

Model backbone MS Sched. Param. Input GFLOPs FPS mAP download
FCOSR-S Mobilenet v2 - 3x 7.32M 1024×1024 101.42 23.7 74.05 model/cfg
FCOSR-S Mobilenet v2 3x 7.32M 1024×1024 101.42 23.7 76.11 model/cfg
FCOSR-M ResNext50-32x4 - 3x 31.4M 1024×1024 210.01 14.6 77.15 model/cfg
FCOSR-M ResNext50-32x4 3x 31.4M 1024×1024 210.01 14.6 79.25 model/cfg
FCOSR-L ResNext101-64x4 - 3x 89.64M 1024×1024 445.75 7.9 77.39 model/cfg
FCOSR-L ResNext101-64x4 3x 89.64M 1024×1024 445.75 7.9 78.80 model/cfg

FCOSR serise DOTA 1.5 result. FPS(2080ti) Detail

Model backbone MS Sched. Param. Input GFLOPs FPS mAP download
FCOSR-S Mobilenet v2 - 3x 7.32M 1024×1024 101.42 23.7 66.37 model/cfg
FCOSR-S Mobilenet v2 3x 7.32M 1024×1024 101.42 23.7 73.14 model/cfg
FCOSR-M ResNext50-32x4 - 3x 31.4M 1024×1024 210.01 14.6 68.74 model/cfg
FCOSR-M ResNext50-32x4 3x 31.4M 1024×1024 210.01 14.6 73.79 model/cfg
FCOSR-L ResNext101-64x4 - 3x 89.64M 1024×1024 445.75 7.9 69.96 model/cfg
FCOSR-L ResNext101-64x4 3x 89.64M 1024×1024 445.75 7.9 75.41 model/cfg

FCOSR serise HRSC2016 result. FPS(2080ti)

Model backbone Rot. Sched. Param. Input GFLOPs FPS AP50(07) AP75(07) AP50(12) AP75(12) download
FCOSR-S Mobilenet v2 40k iters 7.29M 800×800 61.57 35.3 90.08 76.75 92.67 75.73 model/cfg
FCOSR-M ResNext50-32x4 40k iters 31.37M 800×800 127.87 26.9 90.15 78.58 94.84 81.38 model/cfg
FCOSR-L ResNext101-64x4 40k iters 89.61M 800×800 271.75 15.1 90.14 77.98 95.74 80.94 model/cfg

Lightweight FCOSR test result on Jetson Xavier NX (DOTA 1.0 single-scale). Detail

Model backbone Head channels Sched. Param Size Input GFLOPs FPS mAP onnx TensorRT
FCOSR-lite Mobilenet v2 256 3x 6.9M 51.63MB 1024×1024 101.25 7.64 74.30 onnx trt
FCOSR-tiny Mobilenet v2 128 3x 3.52M 23.2MB 1024×1024 35.89 10.68 73.93 onnx trt

Lightweight FCOSR test result on Jetson AGX Xavier (DOTA 1.0 single-scale).

A part of Dota1.0 dataset (whole image mode) Code

name size patch size gap patches det objects det time(s)
P0031.png 5343×3795 1024 200 35 1197 2.75
P0051.png 4672×5430 1024 200 42 309 2.38
P0112.png 6989×4516 1024 200 54 184 3.02
P0137.png 5276×4308 1024 200 35 66 1.95
P1004.png 7001×3907 1024 200 45 183 2.52
P1125.png 7582×4333 1024 200 54 28 2.95
P1129.png 4093×6529 1024 200 40 70 2.23
P1146.png 5231×4616 1024 200 42 64 2.29
P1157.png 7278×5286 1024 200 63 184 3.47
P1378.png 5445×4561 1024 200 42 83 2.32
P1379.png 4426×4182 1024 200 30 686 1.78
P1393.png 6072×6540 1024 200 64 893 3.63
P1400.png 6471×4479 1024 200 48 348 2.63
P1402.png 4112×4793 1024 200 30 293 1.68
P1406.png 6531×4182 1024 200 40 19 2.19
P1415.png 4894x4898 1024 200 36 190 1.99
P1436.png 5136×5156 1024 200 42 39 2.31
P1448.png 7242×5678 1024 200 63 51 3.41
P1457.png 5193×4658 1024 200 42 382 2.33
P1461.png 6661×6308 1024 200 64 27 3.45
P1494.png 4782×6677 1024 200 48 70 2.61
P1500.png 4769×4386 1024 200 36 92 1.96
P1772.png 5963×5553 1024 200 49 28 2.70
P1774.png 5352×4281 1024 200 35 291 1.95
P1796.png 5870×5822 1024 200 49 308 2.74
P1870.png 5942×6059 1024 200 56 135 3.04
P2043.png 4165×3438 1024 200 20 1479 1.49
P2329.png 7950×4334 1024 200 60 83 3.26
P2641.png 7574×5625 1024 200 63 269 3.41
P2642.png 7039×5551 1024 200 63 451 3.50
P2643.png 7568×5619 1024 200 63 249 3.40
P2645.png 4605×3442 1024 200 24 357 1.42
P2762.png 8074×4359 1024 200 60 127 3.23
P2795.png 4495×3981 1024 200 30 65 1.64
Simple and ready-to-use tutorials for TensorFlow

TensorFlow World To support maintaining and upgrading this project, please kindly consider Sponsoring the project developer. Any level of support is a

Amirsina Torfi 4.5k Dec 23, 2022
Bottleneck Transformers for Visual Recognition

Bottleneck Transformers for Visual Recognition Experiments Model Params (M) Acc (%) ResNet50 baseline (ref) 23.5M 93.62 BoTNet-50 18.8M 95.11% BoTNet-

Myeongjun Kim 236 Jan 03, 2023
source code and pre-trained/fine-tuned checkpoint for NAACL 2021 paper LightningDOT

LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval This repository contains source code and pre-trained/fine-tun

Siqi 65 Dec 26, 2022
A Comparative Framework for Multimodal Recommender Systems

Cornac Cornac is a comparative framework for multimodal recommender systems. It focuses on making it convenient to work with models leveraging auxilia

Preferred.AI 671 Jan 03, 2023
SHIFT15M: multiobjective large-scale fashion dataset with distributional shifts

[arXiv] The main motivation of the SHIFT15M project is to provide a dataset that contains natural dataset shifts collected from a web service IQON, wh

ZOZO, Inc. 138 Nov 24, 2022
Blender Python - Node-based multi-line text and image flowchart

MindMapper v0.8 Node-based text and image flowchart for Blender Mindmap with shortcuts visible: Mindmap with shortcuts hidden: Notes This was requeste

SpectralVectors 58 Oct 08, 2022
Zero-shot Learning by Generating Task-specific Adapters

Code for "Zero-shot Learning by Generating Task-specific Adapters" This is the repository containing code for "Zero-shot Learning by Generating Task-s

INK Lab @ USC 11 Dec 17, 2021
An all-in-one application to visualize multiple different local path planning algorithms

Table of Contents Table of Contents Local Planner Visualization Project (LPVP) Features Installation/Usage Local Planners Probabilistic Roadmap (PRM)

Abdur Javaid 47 Dec 30, 2022
PyMove is a Python library to simplify queries and visualization of trajectories and other spatial-temporal data

Use PyMove and go much further Information Package Status License Python Version Platforms Build Status PyPi version PyPi Downloads Conda version Cond

Insight Data Science Lab 64 Nov 15, 2022
Vector Neurons: A General Framework for SO(3)-Equivariant Networks

Vector Neurons: A General Framework for SO(3)-Equivariant Networks Created by Congyue Deng, Or Litany, Yueqi Duan, Adrien Poulenard, Andrea Tagliasacc

Congyue Deng 332 Dec 29, 2022
Rotary Transformer

[中文|English] Rotary Transformer Rotary Transformer is an MLM pre-trained language model with rotary position embedding (RoPE). The RoPE is a relative

325 Jan 03, 2023
Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening

Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening Introduction This is an implementation of the model used for breast

757 Dec 30, 2022
[peer review] An Arbitrary Scale Super-Resolution Approach for 3D MR Images using Implicit Neural Representation

ArSSR This repository is the pytorch implementation of our manuscript "An Arbitrary Scale Super-Resolution Approach for 3-Dimensional Magnetic Resonan

Qing Wu 19 Dec 12, 2022
Level Based Customer Segmentation

level_based_customer_segmentation Level Based Customer Segmentation Persona Veri Seti kullanılarak müşteri segmentasyonu yapılmıştır. KOLONLAR : PRICE

Buse Yıldırım 6 Dec 21, 2021
A Robust Unsupervised Ensemble of Feature-Based Explanations using Restricted Boltzmann Machines

A Robust Unsupervised Ensemble of Feature-Based Explanations using Restricted Boltzmann Machines Understanding the results of deep neural networks is

Johan van den Heuvel 2 Dec 13, 2021
Training Very Deep Neural Networks Without Skip-Connections

DiracNets v2 update (January 2018): The code was updated for DiracNets-v2 in which we removed NCReLU by adding per-channel a and b multipliers without

Sergey Zagoruyko 585 Oct 12, 2022
Kinetics-Data-Preprocessing

Kinetics-Data-Preprocessing Kinetics-400 and Kinetics-600 are common video recognition datasets used by popular video understanding projects like Slow

Kaihua Tang 7 Oct 27, 2022
Deep learning PyTorch library for time series forecasting, classification, and anomaly detection

Deep learning for time series forecasting Flow forecast is an open-source deep learning for time series forecasting framework. It provides all the lat

AIStream 1.2k Jan 04, 2023
Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images"

GANInversion_with_ConsecutiveImgs Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images" https://a

QingyangXu 38 Dec 07, 2022
Angular & Electron desktop UI framework. Angular components for native looking and behaving macOS desktop UI (Electron/Web)

Angular Desktop UI This is a collection for native desktop like user interface components in Angular, especially useful for Electron apps. It starts w

Marc J. Schmidt 49 Dec 22, 2022