PyTorch implementation of our ICCV paper DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection.

Last update: Dec 29, 2022

Related tags

Overview

Introduction

This repo contains the official PyTorch implementation of our ICCV paper DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection.

Updates!!

【2021/10/10】 We release the official PyTorch implementation of DeFRCN.
【2021/08/20】 We have uploaded our paper (long version with supplementary material) on arxiv, review it for more details.

Quick Start

1. Check Requirements

Linux with Python >= 3.6
PyTorch >= 1.6 & torchvision that matches the PyTorch version.
CUDA 10.1, 10.2
GCC >= 4.9

2. Build DeFRCN

Clone Code

git clone https://github.com/er-muyue/DeFRCN.git
cd DeFRCN

Create a virtual environment (optional)

virtualenv defrcn
cd /path/to/venv/defrcn
source ./bin/activate

Install PyTorch 1.6.0 with CUDA 10.1

pip3 install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html

Install Detectron2
```
python3 -m pip install detectron2==0.3 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.6/index.html
```
- If you use other version of PyTorch/CUDA, check the latest version of Detectron2 in this page: Detectron2.
- Sorry for that I don’t have enough time to test on more versions, if you run into problems with other versions, please let me know.

Install other requirements.

python3 -m pip install -r requirements.txt

3. Prepare Data and Weights

Data Preparation

We evaluate our models on two datasets for both FSOD and G-FSOD settings:

Dataset	Size	GoogleDrive	BaiduYun	Note
VOC2007	0.8G	download	download	-
VOC2012	3.5G	download	download	-
vocsplit	<1M	download	download	refer from TFA
COCO	~19G	-	-	download from offical
cocosplit	174M	download	download	refer from TFA

Unzip the downloaded data-source to datasets and put it into your project directory:

  ...
  datasets
    | -- coco (trainval2014/*.jpg, val2014/*.jpg, annotations/*.json)
    | -- cocosplit
    | -- VOC2007
    | -- VOC2012
    | -- vocsplit
  defrcn
  tools
  ...

Weights Preparation
- We use the imagenet pretrain weights to initialize our model. Download the same models from here: GoogleDrive BaiduYun
- The extract code for all BaiduYun link is 0000

4. Training and Evaluation

For ease of training and evaluation over multiple runs, we integrate the whole pipeline of few-shot object detection into one script run_*.sh, including base pre-training and novel-finetuning (both FSOD and G-FSOD).

To reproduce the results on VOC, EXP_NAME can be any string (e.g defrcn, or something) and SPLIT_ID must be 1 or 2 or 3 (we consider 3 random splits like other papers).
```
bash run_voc.sh EXP_NAME SPLIT_ID (1, 2 or 3)
```
To reproduce the results on COCO, EXP_NAME can be any string (e.g defrcn, or something)
```
bash run_coco.sh EXP_NAME
```
Please read the details of few-shot object detection pipeline in run_*.sh, you need change IMAGENET_PRETRAIN* to your path.

Results on COCO Benchmark

Few-shot Object Detection

Method			mAP^novel
Shot	1	2	3	5	10	30
FRCN-ft	1.0*	1.8*	2.8*	4.0*	6.5	11.1
FSRW	-	-	-	-	5.6	9.1
MetaDet	-	-	-	-	7.1	11.3
MetaR-CNN	-	-	-	-	8.7	12.4
TFA	4.4*	5.4*	6.0*	7.7*	10.0	13.7
MPSR	5.1*	6.7*	7.4*	8.7*	9.8	14.1
FSDetView	4.5	6.6	7.2	10.7	12.5	14.7
DeFRCN (Our Paper)	9.3	12.9	14.8	16.1	18.5	22.6
DeFRCN (This Repo)	9.7	13.1	14.5	15.6	18.4	22.6

Generalized Few-shot Object Detection

Method			mAP^novel
Shot	1	2	3	5	10	30
FRCN-ft	1.7	3.1	3.7	4.6	5.5	7.4
TFA	1.9	3.9	5.1	7	9.1	12.1
FSDetView	3.2	4.9	6.7	8.1	10.7	15.9
DeFRCN (Our Paper)	4.8	8.5	10.7	13.6	16.8	21.2
DeFRCN (This Repo)	4.8	8.5	10.7	13.5	16.7	21.0

* indicates that the results are reproduced by us with their source code.
It's normal to observe -0.3~+0.3AP noise between your results and this repo.
The results of mAP^base and mAP^all for G-FSOD are list here GoogleDrive, BaiduYun.
If you have any problem of above results in this repo, you can download configs and train logs from GoogleDrive, BaiduYun.

Results on VOC Benchmark

Few-shot Object Detection

Method			Split-1					Split-2					Split-3
Shot	1	2	3	5	10	1	2	3	5	10	1	2	3	5	10
YOLO-ft	6.6	10.7	12.5	24.8	38.6	12.5	4.2	11.6	16.1	33.9	13.0	15.9	15.0	32.2	38.4
FRCN-ft	13.8	19.6	32.8	41.5	45.6	7.9	15.3	26.2	31.6	39.1	9.8	11.3	19.1	35.0	45.1
FSRW	14.8	15.5	26.7	33.9	47.2	15.7	15.2	22.7	30.1	40.5	21.3	25.6	28.4	42.8	45.9
MetaDet	18.9	20.6	30.2	36.8	49.6	21.8	23.1	27.8	31.7	43.0	20.6	23.9	29.4	43.9	44.1
MetaR-CNN	19.9	25.5	35.0	45.7	51.5	10.4	19.4	29.6	34.8	45.4	14.3	18.2	27.5	41.2	48.1
TFA	39.8	36.1	44.7	55.7	56.0	23.5	26.9	34.1	35.1	39.1	30.8	34.8	42.8	49.5	49.8
MPSR	41.7	-	51.4	55.2	61.8	24.4	-	39.2	39.9	47.8	35.6	-	42.3	48.0	49.7
DeFRCN (Our Paper)	53.6	57.5	61.5	64.1	60.8	30.1	38.1	47.0	53.3	47.9	48.4	50.9	52.3	54.9	57.4
DeFRCN (This Repo)	55.1	57.4	61.1	64.6	61.5	32.1	40.5	47.9	52.9	47.5	48.9	51.9	52.3	55.7	59.0

Generalized Few-shot Object Detection

Method			Split-1					Split-2					Split-3
Shot	1	2	3	5	10	1	2	3	5	10	1	2	3	5	10
FRCN-ft	9.9	15.6	21.6	28.0	52.0	9.4	13.8	17.4	21.9	39.7	8.1	13.9	19	23.9	44.6
FSRW	14.2	23.6	29.8	36.5	35.6	12.3	19.6	25.1	31.4	29.8	12.5	21.3	26.8	33.8	31.0
TFA	25.3	36.4	42.1	47.9	52.8	18.3	27.5	30.9	34.1	39.5	17.9	27.2	34.3	40.8	45.6
FSDetView	24.2	35.3	42.2	49.1	57.4	21.6	24.6	31.9	37.0	45.7	21.2	30.0	37.2	43.8	49.6
DeFRCN (Our Paper)	40.2	53.6	58.2	63.6	66.5	29.5	39.7	43.4	48.1	52.8	35.0	38.3	52.9	57.7	60.8
DeFRCN (This Repo)	43.8	57.5	61.4	65.3	67.0	31.5	40.9	45.6	50.1	52.9	38.2	50.9	54.1	59.2	61.9

Note that we change the λ^GDL-RCNN for VOC to 0.001 (0.01 in paper) and get better performance, check the configs for more details.
The results of mAP^base and mAP^all for G-FSOD are list here GoogleDrive, BaiduYun.
If you have any problem of above results in this repo, you can download configs and logs from GoogleDrive, BaiduYun.

Acknowledgement

This repo is developed based on TFA and Detectron2. Please check them for more details and features.

Citing

If you use this work in your research or wish to refer to the baseline results published here, please use the following BibTeX entries:

@inproceedings{qiao2021defrcn,
  title={DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection},
  author={Qiao, Limeng and Zhao, Yuxuan and Li, Zhiyuan and Qiu, Xi and Wu, Jianan and Zhang, Chi},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={8681--8690},
  year={2021}
}

PyTorch implementation of our ICCV paper DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection.

Related tags

Overview

Introduction

Updates!!

Quick Start

Results on COCO Benchmark

Results on VOC Benchmark

Acknowledgement

Citing

Owner

You Only Look Once for Panopitic Driving Perception

Pytorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)

An addon uses SMPL's poses and global translation to drive cartoon character in Blender.

Users can free try their models on SIDD dataset based on this code

Visualize Camera's Pose Using Extrinsic Parameter by Plotting Pyramid Model on 3D Space

CVAT is free, online, interactive video and image annotation tool for computer vision

This is the unofficial code of Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes. which achieve state-of-the-art trade-off between accuracy and speed on cityscapes and camvid, without using inference acceleration and extra data

Code release for ConvNeXt model

This repo is about implementing different approaches of pose estimation and also is a sub-task of the smart hospital bed project :smile:

Semantic Image Synthesis with SPADE

Corgis are the cutest creatures; have 30K of them!

DyNet: The Dynamic Neural Network Toolkit

A repository for storing njxzc final exam review material

RNG-KBQA: Generation Augmented Iterative Ranking for Knowledge Base Question Answering

Code for EMNLP'21 paper "Types of Out-of-Distribution Texts and How to Detect Them"

COVINS -- A Framework for Collaborative Visual-Inertial SLAM and Multi-Agent 3D Mapping

Pansharpening by convolutional neural networks in the full resolution framework

Minimal PyTorch implementation of Generative Latent Optimization from the paper "Optimizing the Latent Space of Generative Networks"

Generalized Data Weighting via Class-level Gradient Manipulation

More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval