Official code of the paper "ReDet: A Rotation-equivariant Detector for Aerial Object Detection" (CVPR 2021)

Last update: Dec 23, 2022

Overview

ReDet: A Rotation-equivariant Detector for Aerial Object Detection

ReDet: A Rotation-equivariant Detector for Aerial Object Detection (CVPR 2021),
Jiaming Han*, Jian Ding*, Nan Xue, Gui-Song Xia†,
arXiv preprint (arXiv:2103.07733).

This repo is based on AerialDetection and mmdetection. AerialDetection is a framework for object detection in aerial images that provides many useful algorithms and tools.

Introduction

Recently, object detection in aerial images has gained much attention in computer vision. Unlike objects in natural images, aerial objects often appear in arbitrary orientations. Detectors therefore require more parameters to encode orientation information, and these parameters are often highly redundant and inefficient. Moreover, as ordinary CNNs do not explicitly model orientation variation, large amounts of rotation-augmented data are needed to train an accurate object detector. In this paper, we propose a Rotation-equivariant Detector (ReDet) to address these issues, which explicitly encodes rotation equivariance and rotation invariance. More precisely, we incorporate rotation-equivariant networks into the detector to extract rotation-equivariant features, which can accurately predict the orientation and lead to a large reduction in model size. Based on the rotation-equivariant features, we also present Rotation-invariant RoI Align (RiRoI Align), which adaptively extracts rotation-invariant features from equivariant features according to the orientation of the RoI. Extensive experiments on several challenging aerial image datasets (DOTA-v1.0, DOTA-v1.5 and HRSC2016) show that our method achieves state-of-the-art performance on the task of aerial object detection. Compared with previous best results, ReDet gains 1.2, 3.5 and 2.6 mAP on DOTA-v1.0, DOTA-v1.5 and HRSC2016 respectively, while reducing the number of parameters by 60% (313 Mb vs. 121 Mb).
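To make the two ideas above concrete, here is a minimal pure-Python sketch. It is not the code in this repo, and it makes several simplifying assumptions: it uses the four-fold rotation group C4 (exact 90-degree rotations) instead of the C₈ group listed in the model zoo, it uses a single hand-written filter instead of a learned network, and it rotates the whole image where RiRoI Align would warp an oriented RoI. The helper names (`rot90`, `correlate`, `lift`, `align`) are hypothetical. The sketch shows that rotating the input rotates each feature map and cyclically shifts the orientation channels (equivariance), and that undoing both the spatial rotation and the channel shift recovers the same features for every input rotation (invariance).

```python
def rot90(a, k=1):
    """Rotate a square 2D list counterclockwise by k * 90 degrees."""
    a = [list(row) for row in a]
    for _ in range(k % 4):
        a = [list(row) for row in zip(*a)][::-1]  # transpose, then flip rows
    return a


def correlate(x, f):
    """Valid 2D cross-correlation of image x with filter f."""
    n, m, p, q = len(x), len(x[0]), len(f), len(f[0])
    return [[sum(x[i + u][j + v] * f[u][v] for u in range(p) for v in range(q))
             for j in range(m - q + 1)]
            for i in range(n - p + 1)]


def lift(x, f, n_rot=4):
    """Apply all rotated copies of f: one orientation channel per rotation."""
    return [correlate(x, rot90(f, k)) for k in range(n_rot)]


def align(feats, r):
    """Undo an input rotation of r * 90 degrees.

    Spatial alignment: rotate every feature map back by r quarter turns.
    Orientation alignment: cyclically shift the orientation channels by r.
    """
    n = len(feats)
    return [rot90(feats[(k + r) % n], -r) for k in range(n)]


# Illustrative data only: a small integer image and an asymmetric filter.
image = [[(3 * i + 5 * j * j + i * j) % 7 for j in range(5)] for i in range(5)]
kernel = [[1, 2, 0], [0, -1, 3], [4, 0, -2]]

reference = lift(image, kernel)
for r in range(4):
    rotated_feats = lift(rot90(image, r), kernel)
    # After alignment, the features no longer depend on the input rotation.
    assert align(rotated_feats, r) == reference
print("aligned features match for all four rotations")
```

For orientations that are not multiples of the group's rotation step, an exact channel shift is no longer enough and some form of interpolation between neighbouring orientation channels is needed; this sketch does not cover that case.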

Changelog

2021-03-09. Code released.

Benchmark and model zoo

ImageNet pretrain

We pretrain ReResNet on ImageNet-1K. The related code can be found in the ReDet_mmcls branch. For convenience, we provide our pretrained ReResNet-50 model here. If you want to train and use ReResNet in your own project, please check out ReDet_mmcls for installation and basic usage.

Model	Group	Top-1 (%)	Top-5 (%)	Download
ReR50	C₈	71.20	90.28	model | log

Object Detection

Model	Data	Backbone	MS	Rotate	Lr schd	box AP	Download
ReDet	DOTA-v1.0	ReR50-FPN	-	-	1x	76.25	cfg model log
ReDet	DOTA-v1.0	ReR50-FPN	✓	✓	1x	80.10	cfg model log
ReDet	DOTA-v1.5	ReR50-FPN	-	-	1x	66.86	cfg model log
ReDet	DOTA-v1.5	ReR50-FPN	✓	✓	1x	76.80	cfg model log
ReDet	HRSC2016	ReR50-FPN	-	-	3x	90.46	cfg model log

If you cannot access Google Drive, a BaiduYun download link can be found here (extraction code: ABCD).

Installation

Please refer to INSTALL.md for installation and dataset preparation.

Getting Started

Please see GETTING_STARTED.md for the basic usage.

Citation

@inproceedings{han2021ReDet,
  author = {Han, Jiaming and Ding, Jian and Xue, Nan and Xia, Gui-Song},
  title = {ReDet: A Rotation-equivariant Detector for Aerial Object Detection},
  booktitle = {Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR)},
  year = {2021}
}
