CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Detection in Remote Sensing Images

Last update: Dec 12, 2022

CFC-Net

This project hosts the official implementation for the paper:

CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Detection in Remote Sensing Images [arxiv]

(The paper and the complete code are coming soon.)

Abstract

In this paper, we discuss the role of discriminative features in object detection, and then propose a Critical Feature Capturing Network (CFC-Net) to improve detection accuracy from three aspects: building powerful feature representations, refining preset anchors, and optimizing label assignment. The proposed framework creates more powerful semantic representations for objects in remote sensing images and achieves high-performance real-time object detection. Note that our model is a one-stage detector with only one anchor at each location in the feature maps, which makes it comparable to anchor-free methods and keeps inference fast.

Requirements

torch >= 1.1
CUDA version >=10.0
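
The two minimums above can be checked before installing anything else. The following is a minimal sketch, not part of the repository: parse_version and meets_minimum are hypothetical helper names, and the comparison only looks at the leading numeric part of a version string (so a local suffix such as "+cu110" is ignored).

```python
import re


def parse_version(text):
    """Return the leading numeric components of a version string as ints."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """True if `installed` is at least `minimum`, compared component-wise."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so "1.1" compares like "1.1.0".
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b
```

In an environment where PyTorch is installed, the helpers could be called as meets_minimum(torch.__version__, "1.1") and, for GPU builds, meets_minimum(torch.version.cuda, "10.0"). Note that torch.version.cuda is None on CPU-only builds, so that case needs a separate check.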

Installation

pip install -r requirements.txt
pip install git+https://github.com/lehduong/torch-warmup-lr.git

cd $ROOT/utils
sh make.sh

cd $ROOT/datasets/DOTA_devkit
sudo apt-get install swig
swig -c++ -python polyiou.i
python setup.py build_ext --inplace
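
As its name suggests, the polyiou extension built above computes the overlap between polygons, which is needed when boxes are rotated. For intuition only, here is a pure-Python sketch of polygon IoU using Sutherland-Hodgman clipping and the shoelace formula. It is an illustration under the assumption that both polygons are convex (oriented boxes are), and it is not the devkit's implementation; all function names are hypothetical.

```python
def signed_area(points):
    """Shoelace formula; positive when the vertices run counter-clockwise."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def ensure_ccw(points):
    """Return the vertices as a list in counter-clockwise order."""
    points = [tuple(p) for p in points]
    return points if signed_area(points) >= 0 else points[::-1]


def clip_convex(subject, clip):
    """Clip `subject` against the convex, counter-clockwise polygon `clip`."""

    def inside(p, a, b):
        # p lies on or to the left of the directed edge a -> b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersect(p, q, a, b):
        # Point where segment p-q crosses the infinite line through a-b.
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        denom = dx1 * dy2 - dy1 * dx2
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / denom
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for a, b in zip(clip, clip[1:] + clip[:1]):
        if not output:
            break
        current, output = output, []
        prev = current[-1]
        for cur in current:
            if inside(cur, a, b):
                if not inside(prev, a, b):
                    output.append(intersect(prev, cur, a, b))
                output.append(cur)
            elif inside(prev, a, b):
                output.append(intersect(prev, cur, a, b))
            prev = cur
    return output


def polygon_iou(poly_a, poly_b):
    """IoU of two convex polygons given as sequences of (x, y) vertices."""
    a, b = ensure_ccw(poly_a), ensure_ccw(poly_b)
    overlap = clip_convex(a, b)
    inter = abs(signed_area(overlap)) if len(overlap) >= 3 else 0.0
    union = abs(signed_area(a)) + abs(signed_area(b)) - inter
    return inter / union if union > 0 else 0.0
```

A compiled extension is used in practice because evaluation compares many box pairs; this sketch favors readability over speed.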

Training

Move the dataset to the $ROOT directory.
Generate imageset files for dataset division via:

cd $ROOT/datasets
python generate_imageset.py
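
The contents of generate_imageset.py are not shown here. As a rough illustration of what dataset division into imageset files involves, the sketch below splits image names into train and test lists reproducibly. The function names, the 80/20 ratio, and the one-stem-per-line file format are assumptions for illustration, not the script's actual behavior.

```python
import random
from pathlib import Path


def split_imageset(image_names, train_ratio=0.8, seed=0):
    """Split image names into (train, test) lists of file stems."""
    # Deduplicate and sort first so the result does not depend on input order.
    stems = sorted({Path(name).stem for name in image_names})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(stems)
    cut = int(len(stems) * train_ratio)
    return sorted(stems[:cut]), sorted(stems[cut:])


def write_imageset(stems, path):
    """Write one image stem per line (assumed imageset file format)."""
    Path(path).write_text("\n".join(stems) + "\n")
```

For example, split_imageset(["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]) yields four training stems and one test stem, and the same seed always gives the same division.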

Modify the configuration file hyp.py and arguments in train.py, then start training:

python train.py

Inference

You can use the following command to test a dataset. Note that weight, img_dir, dataset, and hyp should be modified as appropriate.

python demo.py

Evaluation

Different datasets use different test methods. For UCAS-AOD/HRSC2016/VOC/NWPU VHR-10, you need to prepare labels in the appropriate format in advance. Take evaluation on HRSC2016 for example:

cd $ROOT/datasets/evaluate
python hrsc2gt.py

Then run the evaluation:

python eval.py

Note that:

The label conversion script (hrsc2gt.py in this example) needs to be run only once per dataset; run it again when you test on a different dataset.
The imageset file used in hrsc2gt.py is generated by generate_imageset.py.
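
The evaluation reports mAP, the mean of the per-class average precision (AP). The sketch below shows one common way to compute AP, the VOC-style all-point interpolation, assuming each detection has already been marked as a true or false positive by matching against the ground truth. Which AP variant eval.py actually uses is not stated here, and the function names are hypothetical.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """All-point interpolated AP for a single class."""
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    # Visit detections from highest to lowest confidence.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Precision envelope: each value becomes the max precision at any
    # equal or higher recall, removing the zigzag in the raw curve.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Area under the envelope, summed over the recall increments.
    ap, prev_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(per_class_ap):
    """Mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)
```

For single-class datasets such as HRSC2016 evaluated as one "ship" category, mAP and AP coincide.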

Main Results

Method	Dataset	Backbone	Input Size	mAP
CFC-Net	HRSC2016	ResNet-50	416 x 416	86.3
CFC-Net	HRSC2016	ResNet-101	800 x 800	89.7
CFC-Net	UCAS-AOD	ResNet-50	416 x 416	89.5
CFC-Net	DOTA	ResNet-101	800 x 800	73.5

Detections

Results on HRSC2016: the red bounding boxes denote preset anchors and the green ones denote detection results.
Results on DOTA:

Citation

If you find our work or code useful in your research, please consider citing:

@article{ming2021cfc,
  title={CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Detection in Remote Sensing Images},
  author={Ming, Qi and Miao, Lingjuan and Zhou, Zhiqiang and Dong, Yunpeng},
  journal={arXiv preprint arXiv:2101.06849},
  year={2021}
}

If you have any questions, please contact me via issue or email.

Owner

ming71
