PyTorch implementation of our ICCV paper DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection.

Related tags

Deep LearningDeFRCN
Overview

Introduction

This repo contains the official PyTorch implementation of our ICCV paper DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection.

Updates!!

  • 【2021/10/10】 We release the official PyTorch implementation of DeFRCN.
  • 【2021/08/20】 We have uploaded our paper (long version with supplementary material) on arxiv, review it for more details.

Quick Start

1. Check Requirements

  • Linux with Python >= 3.6
  • PyTorch >= 1.6 & torchvision that matches the PyTorch version.
  • CUDA 10.1, 10.2
  • GCC >= 4.9

2. Build DeFRCN

  • Clone Code
    git clone https://github.com/er-muyue/DeFRCN.git
    cd DeFRCN
    
  • Create a virtual environment (optional)
    virtualenv defrcn
    cd /path/to/venv/defrcn
    source ./bin/activate
    
  • Install PyTorch 1.6.0 with CUDA 10.1
    pip3 install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
  • Install Detectron2
    python3 -m pip install detectron2==0.3 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.6/index.html
    
    • If you use other version of PyTorch/CUDA, check the latest version of Detectron2 in this page: Detectron2.
    • Sorry for that I don’t have enough time to test on more versions, if you run into problems with other versions, please let me know.
  • Install other requirements.
    python3 -m pip install -r requirements.txt
    

3. Prepare Data and Weights

  • Data Preparation
    • We evaluate our models on two datasets for both FSOD and G-FSOD settings:

      Dataset Size GoogleDrive BaiduYun Note
      VOC2007 0.8G download download -
      VOC2012 3.5G download download -
      vocsplit <1M download download refer from TFA
      COCO ~19G - - download from offical
      cocosplit 174M download download refer from TFA
    • Unzip the downloaded data-source to datasets and put it into your project directory:

        ...
        datasets
          | -- coco (trainval2014/*.jpg, val2014/*.jpg, annotations/*.json)
          | -- cocosplit
          | -- VOC2007
          | -- VOC2012
          | -- vocsplit
        defrcn
        tools
        ...
      
  • Weights Preparation
    • We use the imagenet pretrain weights to initialize our model. Download the same models from here: GoogleDrive BaiduYun
    • The extract code for all BaiduYun link is 0000

4. Training and Evaluation

For ease of training and evaluation over multiple runs, we integrate the whole pipeline of few-shot object detection into one script run_*.sh, including base pre-training and novel-finetuning (both FSOD and G-FSOD).

  • To reproduce the results on VOC, EXP_NAME can be any string (e.g defrcn, or something) and SPLIT_ID must be 1 or 2 or 3 (we consider 3 random splits like other papers).
    bash run_voc.sh EXP_NAME SPLIT_ID (1, 2 or 3)
    
  • To reproduce the results on COCO, EXP_NAME can be any string (e.g defrcn, or something)
    bash run_coco.sh EXP_NAME
    
  • Please read the details of few-shot object detection pipeline in run_*.sh, you need change IMAGENET_PRETRAIN* to your path.

Results on COCO Benchmark

  • Few-shot Object Detection

    Method mAPnovel
    Shot 1 2 3 5 10 30
    FRCN-ft 1.0* 1.8* 2.8* 4.0* 6.5 11.1
    FSRW - - - - 5.6 9.1
    MetaDet - - - - 7.1 11.3
    MetaR-CNN - - - - 8.7 12.4
    TFA 4.4* 5.4* 6.0* 7.7* 10.0 13.7
    MPSR 5.1* 6.7* 7.4* 8.7* 9.8 14.1
    FSDetView 4.5 6.6 7.2 10.7 12.5 14.7
    DeFRCN (Our Paper) 9.3 12.9 14.8 16.1 18.5 22.6
    DeFRCN (This Repo) 9.7 13.1 14.5 15.6 18.4 22.6
  • Generalized Few-shot Object Detection

    Method mAPnovel
    Shot 1 2 3 5 10 30
    FRCN-ft 1.7 3.1 3.7 4.6 5.5 7.4
    TFA 1.9 3.9 5.1 7 9.1 12.1
    FSDetView 3.2 4.9 6.7 8.1 10.7 15.9
    DeFRCN (Our Paper) 4.8 8.5 10.7 13.6 16.8 21.2
    DeFRCN (This Repo) 4.8 8.5 10.7 13.5 16.7 21.0
  • * indicates that the results are reproduced by us with their source code.
  • It's normal to observe -0.3~+0.3AP noise between your results and this repo.
  • The results of mAPbase and mAPall for G-FSOD are list here GoogleDrive, BaiduYun.
  • If you have any problem of above results in this repo, you can download configs and train logs from GoogleDrive, BaiduYun.

Results on VOC Benchmark

  • Few-shot Object Detection

    Method Split-1 Split-2 Split-3
    Shot 1 2 3 5 10 1 2 3 5 10 1 2 3 5 10
    YOLO-ft 6.6 10.7 12.5 24.8 38.6 12.5 4.2 11.6 16.1 33.9 13.0 15.9 15.0 32.2 38.4
    FRCN-ft 13.8 19.6 32.8 41.5 45.6 7.9 15.3 26.2 31.6 39.1 9.8 11.3 19.1 35.0 45.1
    FSRW 14.8 15.5 26.7 33.9 47.2 15.7 15.2 22.7 30.1 40.5 21.3 25.6 28.4 42.8 45.9
    MetaDet 18.9 20.6 30.2 36.8 49.6 21.8 23.1 27.8 31.7 43.0 20.6 23.9 29.4 43.9 44.1
    MetaR-CNN 19.9 25.5 35.0 45.7 51.5 10.4 19.4 29.6 34.8 45.4 14.3 18.2 27.5 41.2 48.1
    TFA 39.8 36.1 44.7 55.7 56.0 23.5 26.9 34.1 35.1 39.1 30.8 34.8 42.8 49.5 49.8
    MPSR 41.7 - 51.4 55.2 61.8 24.4 - 39.2 39.9 47.8 35.6 - 42.3 48.0 49.7
    DeFRCN (Our Paper) 53.6 57.5 61.5 64.1 60.8 30.1 38.1 47.0 53.3 47.9 48.4 50.9 52.3 54.9 57.4
    DeFRCN (This Repo) 55.1 57.4 61.1 64.6 61.5 32.1 40.5 47.9 52.9 47.5 48.9 51.9 52.3 55.7 59.0
  • Generalized Few-shot Object Detection

    Method Split-1 Split-2 Split-3
    Shot 1 2 3 5 10 1 2 3 5 10 1 2 3 5 10
    FRCN-ft 9.9 15.6 21.6 28.0 52.0 9.4 13.8 17.4 21.9 39.7 8.1 13.9 19 23.9 44.6
    FSRW 14.2 23.6 29.8 36.5 35.6 12.3 19.6 25.1 31.4 29.8 12.5 21.3 26.8 33.8 31.0
    TFA 25.3 36.4 42.1 47.9 52.8 18.3 27.5 30.9 34.1 39.5 17.9 27.2 34.3 40.8 45.6
    FSDetView 24.2 35.3 42.2 49.1 57.4 21.6 24.6 31.9 37.0 45.7 21.2 30.0 37.2 43.8 49.6
    DeFRCN (Our Paper) 40.2 53.6 58.2 63.6 66.5 29.5 39.7 43.4 48.1 52.8 35.0 38.3 52.9 57.7 60.8
    DeFRCN (This Repo) 43.8 57.5 61.4 65.3 67.0 31.5 40.9 45.6 50.1 52.9 38.2 50.9 54.1 59.2 61.9
  • Note that we change the λGDL-RCNN for VOC to 0.001 (0.01 in paper) and get better performance, check the configs for more details.

  • The results of mAPbase and mAPall for G-FSOD are list here GoogleDrive, BaiduYun.

  • If you have any problem of above results in this repo, you can download configs and logs from GoogleDrive, BaiduYun.

Acknowledgement

This repo is developed based on TFA and Detectron2. Please check them for more details and features.

Citing

If you use this work in your research or wish to refer to the baseline results published here, please use the following BibTeX entries:

@inproceedings{qiao2021defrcn,
  title={DeFRCN: Decoupled Faster R-CNN for Few-Shot Object Detection},
  author={Qiao, Limeng and Zhao, Yuxuan and Li, Zhiyuan and Qiu, Xi and Wu, Jianan and Zhang, Chi},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={8681--8690},
  year={2021}
}
Pretrained models for Jax/Haiku; MobileNet, ResNet, VGG, Xception.

Pre-trained image classification models for Jax/Haiku Jax/Haiku Applications are deep learning models that are made available alongside pre-trained we

Alper Baris CELIK 14 Dec 20, 2022
Implementation of Kronecker Attention in Pytorch

Kronecker Attention Pytorch Implementation of Kronecker Attention in Pytorch. Results look less than stellar, but if someone found some context where

Phil Wang 16 May 06, 2022
An efficient and easy-to-use deep learning model compression framework

TinyNeuralNetwork 简体中文 TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework, which contains features like neura

Alibaba 441 Dec 25, 2022
Implementation for the EMNLP 2021 paper "Interactive Machine Comprehension with Dynamic Knowledge Graphs".

Interactive Machine Comprehension with Dynamic Knowledge Graphs Implementation for the EMNLP 2021 paper. Dependencies apt-get -y update apt-get instal

Xingdi (Eric) Yuan 19 Aug 23, 2022
Implementation of "Semi-supervised Domain Adaptive Structure Learning"

Semi-supervised Domain Adaptive Structure Learning - ASDA This repo contains the source code and dataset for our ASDA paper. Illustration of the propo

3 Dec 13, 2021
GDSC-ML Team Interview Task

GDSC-ML-Team---Interview-Task Task 1 : Clean or Messy room In this task we have to classify the given test images as clean or messy. - Link for datase

Aayush. 1 Jan 19, 2022
A set of tools for converting a darknet dataset to COCO format working with YOLOX

darknet格式数据→COCO darknet训练数据目录结构(详情参见dataset/darknet): darknet ├── class.names ├── gen_config.data ├── gen_train.txt ├── gen_valid.txt └── images

RapidAI-NG 148 Jan 03, 2023
Generative code template for PixelBeasts 10k NFT project.

generator-template Generative code template for combining transparent png attributes into 10,000 unique images. Used for the PixelBeasts 10k NFT proje

Yohei Nakajima 9 Aug 24, 2022
SegNet model implemented using keras framework

keras-segnet Implementation of SegNet-like architecture using keras. Current version doesn't support index transferring proposed in SegNet article, so

185 Aug 30, 2022
Source code of SIGIR2021 Paper 'One Chatbot Per Person: Creating Personalized Chatbots based on Implicit Profiles'

DHAP Source code of SIGIR2021 Long Paper: One Chatbot Per Person: Creating Personalized Chatbots based on Implicit User Profiles . Preinstallation Fir

ZYMa 32 Dec 06, 2022
existing and custom freqtrade strategies supporting the new hyperstrategy format.

freqtrade-strategies Description Existing and self-developed strategies, rewritten to support the new HyperStrategy format from the freqtrade-develop

39 Aug 20, 2021
Official code implementation for "Personalized Federated Learning using Hypernetworks"

Personalized Federated Learning using Hypernetworks This is an official implementation of Personalized Federated Learning using Hypernetworks paper. [

Aviv Shamsian 121 Dec 25, 2022
A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains (IJCV submission)

wsss-analysis The code of: A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains, arXiv pre-print 2019 paper.

Lyndon Chan 48 Dec 18, 2022
3D position tracking for soccer players with multi-camera videos

This repo contains a full pipeline to support 3D position tracking of soccer players, with multi-view calibrated moving/fixed video sequences as inputs.

Yuchang Jiang 72 Dec 27, 2022
Vehicle Detection Using Deep Learning and YOLO Algorithm

VehicleDetection Vehicle Detection Using Deep Learning and YOLO Algorithm Dataset take or find vehicle images for create a special dataset for fine-tu

Maryam Boneh 96 Jan 05, 2023
Miscellaneous and lightweight network tools

Network Tools Collection of miscellaneous and lightweight network tools to simplify daily operations, administration, and troubleshooting of networks.

Nicholas Russo 22 Mar 22, 2022
The GitHub repository for the paper: “Time Series is a Special Sequence: Forecasting with Sample Convolution and Interaction“.

SCINet This is the original PyTorch implementation of the following work: Time Series is a Special Sequence: Forecasting with Sample Convolution and I

386 Jan 01, 2023
Official repository of PanoAVQA: Grounded Audio-Visual Question Answering in 360° Videos (ICCV 2021)

Pano-AVQA Official repository of PanoAVQA: Grounded Audio-Visual Question Answering in 360° Videos (ICCV 2021) [Paper] [Poster] [Video] Getting Starte

Heeseung Yun 9 Dec 23, 2022
Benchmark for Answering Existential First Order Queries with Single Free Variable

EFO-1-QA Benchmark for First Order Query Estimation on Knowledge Graphs This repository contains an entire pipeline for the EFO-1-QA benchmark. EFO-1

HKUST-KnowComp 14 Oct 24, 2022
Code for training and evaluation of the model from "Language Generation with Recurrent Generative Adversarial Networks without Pre-training"

Language Generation with Recurrent Generative Adversarial Networks without Pre-training Code for training and evaluation of the model from "Language G

Amir Bar 253 Sep 14, 2022