Scale-aware Automatic Augmentation for Object Detection (CVPR 2021)

Last update: Dec 29, 2022

Related tags

Computer Vision SA-AutoAug

Overview

SA-AutoAug

Scale-aware Automatic Augmentation for Object Detection

Yukang Chen, Yanwei Li, Tao Kong, Lu Qi, Ruihang Chu, Lei Li, Jiaya Jia

[Paper] [BibTeX]

This project provides the implementation for the CVPR 2021 paper "Scale-aware Automatic Augmentation for Object Detection". Scale-aware AutoAug provides a new search space and search metric to find effective data agumentation policies for object detection. It is implemented on maskrcnn-benchmark and FCOS. Both search and training codes have been released. To facilitate more use, we re-implement the training code based on Detectron2.

Installation

For maskrcnn-benchmark code, please follow INSTALL.md for instruction.

For FCOS code, please follow INSTALL.md for instruction.

For Detectron2 code, please follow INSTALL.md for instruction.

Search

(You can skip this step and directly train on our searched policies.)

To search with 8 GPUs, run:

cd /path/to/SA-AutoAug/maskrcnn-benchmark
export NGPUS=8
python3 -m torch.distributed.launch --nproc_per_node=$NGPUS tools/search.py --config-file configs/SA_AutoAug/retinanet_R-50-FPN_search.yaml OURPUT_DIR /path/to/searchlog_dir

Since we finetune on an existing baseline model during search, a baseline model is needed. You can download this model for search, or you can use other Retinanet baseline model trained by yourself.

Training

To train the searched policies on maskrcnn-benchmark (FCOS)

cd /path/to/SA-AutoAug/maskrcnn-benchmark
export NGPUS=8
python3 -m torch.distributed.launch --nproc_per_node=$NGPUS tools/train_net.py --config-file configs/SA_AutoAug/CONFIG_FILE  OUTPUT_DIR /path/to/traininglog_dir

For example, to train the retinanet ResNet-50 model with our searched data augmentation policies in 6x schedule:

cd /path/to/SA-AutoAug/maskrcnn-benchmark
export NGPUS=8
python3 -m torch.distributed.launch --nproc_per_node=$NGPUS tools/train_net.py --config-file configs/SA_AutoAug/retinanet_R-50-FPN_6x.yaml  OUTPUT_DIR models/retinanet_R-50-FPN_6x_SAAutoAug

To train the searched policies on detectron2

cd /path/to/SA-AutoAug/detectron2
python3 ./tools/train_net.py --num-gpus 8 --config-file ./configs/COCO-Detection/SA_AutoAug/CONFIG_FILE OUTPUT_DIR /path/to/traininglog_dir

For example, to train the retinanet ResNet-50 model with our searched data augmentation policies in 6x schedule:

cd /path/to/SA-AutoAug/detectron2
python3 ./tools/train_net.py --num-gpus 8 --config-file ./configs/COCO-Detection/SA_AutoAug/retinanet_R_50_FPN_6x.yaml OUTPUT_DIR output_retinanet_R_50_FPN_6x_SAAutoAug

Results

We provide the results on COCO val2017 set with pretrained models.

Based on maskrcnn-benchmark

Method	Backbone	AP_bbox	Download
Faster R-CNN	ResNet-50	41.8	Model
Faster R-CNN	ResNet-101	44.2	Model
RetinaNet	ResNet-50	41.4	Model
RetinaNet	ResNet-101	42.8	Model
Mask R-CNN	ResNet-50	42.8	Model
Mask R-CNN	ResNet-101	45.3	Model

Based on FCOS

Method	Backbone	AP_bbox	Download
FCOS	ResNet-50	42.6	Model
FCOS	ResNet-101	44.0	Model
ATSS	ResNext-101-32x8d-dcnv2	48.5	Model
ATSS	ResNext-101-32x8d-dcnv2 (1200 size)	49.6	Model

Based on Detectron2

Method	Backbone	AP_bbox	Download
Faster R-CNN	ResNet-50	41.9	Model - Metrics
Faster R-CNN	ResNet-101	44.2	Model - Metrics
RetinaNet	ResNet-50	40.8	Model - Metrics
RetinaNet	ResNet-101	43.1	Model - Metrics
Mask R-CNN	ResNet-50	-	Training
Mask R-CNN	ResNet-101	-	Training

Citing SA-AutoAug

Consider cite SA-Autoaug in your publications if it helps your research.

@inproceedings{saautoaug,
  title={Scale-aware Automatic Augmentation for Object Detection},
  author={Yukang Chen, Yanwei Li, Tao Kong, Lu Qi, Ruihang Chu, Lei Li, Jiaya Jia},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}

Acknowledgments

This training code of this project is built on maskrcnn-benchmark, Detectron2, FCOS, and ATSS. The search code of this project is modified from DetNAS. Some augmentation code and settings follow AutoAug-Det. We thanks a lot for the authors of these projects.

Note that:

(1) We also provides script files for search and training in maskrcnn-benchmark, FCOS, and, detectron2.

(2) Any issues or pull requests on this project are welcome. In addition, if you meet problems when applying the augmentations to other datasets or codebase, feel free to contact Yukang Chen ([email protected]).

Scale-aware Automatic Augmentation for Object Detection (CVPR 2021)

Related tags

Overview

SA-AutoAug

Installation

Search

Training

Results

Based on maskrcnn-benchmark

Based on FCOS

Based on Detectron2

Citing SA-AutoAug

Acknowledgments

Owner

Jia Research Lab

textspotter - An End-to-End TextSpotter with Explicit Alignment and Attention

Python library to extract tabular data from images and scanned PDFs

It is a image ocr tool using the Tesseract-OCR engine with the pytesseract package and has a GUI.

When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework (CVPR 2021 oral)

Source Code for AAAI 2022 paper "Graph Convolutional Networks with Dual Message Passing for Subgraph Isomorphism Counting and Matching"

Msos searcher - A half-hearted attempt at finding a magic square of squares

docstrum

Web interface for browsing arXiv papers

An expandable and scalable OCR pipeline

M-LSDを用いて四角形を検出し、射影変換を行うサンプルプログラム

Ackermann Line Follower Robot Simulation.

An Implementation of the alogrithm in paper IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection

Educational application aimed at automating user-defined workflows for the mobile game, "Granblue Fantasy", using a variety of CV technologies in the backend such as OpenCV, PyAutoGUI and EasyOCR and a frontend coded in Typescript.

Solution for Problem 1 by team codesquad for AIDL 2020. Uses ML Kit for OCR and OpenCV for image processing

With the virtual keyboard, you can write on the real time images by combining the thumb and index fingers on the letter you want.

This is a repository to learn and get more computer vision skills, make robotics projects integrating the computer vision as a perception tool and create a lot of awesome advanced controllers for the robots of the future.

Um RPG de texto orientado a objetos.

Contextual speed detection for python

A machine learning software for extracting information from scholarly documents

Discord QR Scam Code Generator + Token grab mobile device.