Oriented Object Detection: Oriented RepPoints + Swin Transformer/ReResNet

Last update: Dec 13, 2022

Overview

Oriented RepPoints for Aerial Object Detection

This repository contains the implementation of "Oriented RepPoints + Swin Transformer/ReResNet".

Introduction

Using the Oriented RepPoints detector with a Swin Transformer backbone, this solution took 3rd place on Task 1 and 2nd place on Task 2 of the 2021 Learning to Understand Aerial Images (LUAI) challenge, held at ICCV 2021. Details are given in the paper "LUAI Challenge 2021 on Learning to Understand Aerial Images" (ICCVW 2021).

New Feature

Backbone: added Swin-Transformer and ReResNet
DataAug: added Mosaic4or9, Mixup, HSV, RandomPerspective, and RandomScaleCrop
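
As a rough illustration of one of the augmentations listed above (HSV), the sketch below jitters the hue, saturation, and value of RGB pixels using only the Python standard library. It is not the implementation used in this repo: the function name `hsv_jitter`, the pixel-list input format, and the default gains are all assumptions made for the example.

```python
import colorsys
import random


def hsv_jitter(pixels, h_gain=0.015, s_gain=0.7, v_gain=0.4, rng=None):
    """Randomly scale hue, saturation and value of 8-bit RGB pixels.

    pixels: list of (r, g, b) tuples with channels in 0..255.
    The gain defaults are illustrative assumptions, not values from this repo.
    """
    rng = rng or random.Random()
    # One random multiplier per HSV channel, shared by every pixel of the image.
    h_mul = 1.0 + rng.uniform(-1.0, 1.0) * h_gain
    s_mul = 1.0 + rng.uniform(-1.0, 1.0) * s_gain
    v_mul = 1.0 + rng.uniform(-1.0, 1.0) * v_gain

    out = []
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        h = (h * h_mul) % 1.0              # hue is cyclic, so wrap it
        s = min(max(s * s_mul, 0.0), 1.0)  # clamp saturation to [0, 1]
        v = min(max(v * v_mul, 0.0), 1.0)  # clamp value to [0, 1]
        r2, g2, b2 = colorsys.hsv_to_rgb(h, s, v)
        out.append((round(r2 * 255), round(g2 * 255), round(b2 * 255)))
    return out
```

Because the transform only changes colors, the oriented box annotations of an image are unaffected by it, unlike geometric augmentations such as RandomPerspective or RandomScaleCrop.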

Installation

Please refer to for installation and dataset preparation.

Getting Started

This repo is based on . Please see for the basic usage.

Results and Models

The results on the DOTA test-dev set are shown in the table below (download passwords: aabb/swin/ABCD). For more detailed results, please see the paper.

Model	Backbone	MS	DataAug	DOTAv1 mAP	DOTAv2 mAP	Download
OrientedReppoints	R-50	-	-	75.68	-	baidu(aabb)
OrientedReppoints	R-101	-	√	76.21	-	baidu(aabb)
OrientedReppoints	R-101	√	√	78.12	-	baidu(aabb)
OrientedReppoints	SwinT-tiny	-	√	-	-	-

ImageNet-1K and ImageNet-22K Pretrained Models

name	pretrain	resolution	acc@1	acc@5	#params	FLOPs	FPS	22K model	1K model	Need to turn read version
Swin-T	ImageNet-1K	224x224	81.2	95.5	28M	4.5G	755	-	github/baidu(swin)/config	✔
Swin-S	ImageNet-1K	224x224	83.2	96.2	50M	8.7G	437	-	github/baidu(swin)/config	✔
Swin-B	ImageNet-1K	224x224	83.5	96.5	88M	15.4G	278	-	github/baidu(swin)/config	✔
Swin-B	ImageNet-1K	384x384	84.5	97.0	88M	47.1G	85	-	github/baidu(swin)/test-config	✔
Swin-B	ImageNet-22K	224x224	85.2	97.5	88M	15.4G	278	github/baidu(swin)	github/baidu(swin)/test-config	✔
Swin-B	ImageNet-22K	384x384	86.4	98.0	88M	47.1G	85	github/baidu(swin)	github/baidu(swin)/test-config	✔
Swin-L	ImageNet-22K	224x224	86.3	97.9	197M	34.5G	141	github/baidu(swin)	github/baidu(swin)/test-config	✔
Swin-L	ImageNet-22K	384x384	87.3	98.2	197M	103.9G	42	github/baidu(swin)	github/baidu(swin)/test-config	✔
ReResNet50	ImageNet-1K	224x224	71.20	90.28	-	-	-	-	google/baidu(ABCD)/log	-

The mAOE results on the DOTAv1 val set are shown in the table below (password: aabb).

Model	Backbone	mAOE	Download
OrientedReppoints	R-50	5.93°	baidu(aabb)

Note:

Without ground truth for the test subset, the mAOE for orientation evaluation is computed on the val subset (the original train subset is used for training).
The orientation (angle) of an aerial object is defined as below. For the details of mAOE, please see the paper. The mAOE code is in mAOE_evaluation.py.
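
To give a feel for what an orientation-error metric of this kind computes, here is a minimal sketch. It assumes that mAOE is the mean over classes of the average absolute angle difference between matched predictions and ground-truth boxes, and that angles are in degrees with a 180° period. Both are assumptions for illustration; the exact definition is in the paper and in mAOE_evaluation.py, and the function names here are hypothetical.

```python
def angle_error_deg(pred, gt, period=180.0):
    """Smallest absolute difference between two angles, given their period."""
    diff = abs(pred - gt) % period
    return min(diff, period - diff)


def mean_aoe(matches_by_class, period=180.0):
    """Mean over classes of the per-class average orientation error (degrees).

    matches_by_class: {class_name: [(pred_angle, gt_angle), ...]} holding only
    detections already matched to a ground-truth box (matching is assumed to
    be done elsewhere). Classes without matches are skipped.
    """
    per_class = []
    for pairs in matches_by_class.values():
        if not pairs:
            continue
        errors = [angle_error_deg(p, g, period) for p, g in pairs]
        per_class.append(sum(errors) / len(errors))
    if not per_class:
        raise ValueError("no matched detections to evaluate")
    return sum(per_class) / len(per_class)
```

The wrap-around in `angle_error_deg` matters near the period boundary: under the assumed 180° period, a prediction of 179° against a ground truth of 1° counts as a 2° error rather than 178°.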

Visual results

Visual results of the learned points and the oriented bounding boxes are shown below. The visualization code is .

Learning points

Oriented bounding box

Citation

@article{Li2021oriented,
  title={Oriented RepPoints for Aerial Object Detection},
  author={Wentong Li and Jianke Zhu},
  journal={arXiv preprint arXiv:2105.11111},
  year={2021}
}

Acknowledgements

I have used utility functions from other wonderful open-source projects. Special thanks to the authors of:

OrientedRepPoints

Swin-Transformer-Object-Detection

ReDet
