Leaderboard, taxonomy, and curated list of few-shot object detection papers.

Overview

Awesome Few-Shot Object Detection (FSOD)

Maintainers: Gabriel Huang

For an introduction to the few-shot object detection framework, read below, or check out our survey on few-shot and self-supervised object detection and its project page for full explanations, a discussion of the pitfalls of the Pascal, COCO, and LVIS benchmarks used below, the main takeaways, and future research directions.

Contributing

If you want to add your paper or report a mistake, please create a pull request with all supporting information. Thanks!

Pascal VOC and MS COCO FSOD Leaderboard

In this table we distinguish Kang's splits (Meta-YOLO) from TFA's splits (Frustratingly Simple FSOD), because Kang's splits have been shown to have high variance and to overestimate performance when the number of shots is low. To see this for yourself, compare the TFA 1-shot and Kang 1-shot columns in the table below.

| Name | Type | VOC TFA 1-shot (mAP50) | VOC TFA 3-shot (mAP50) | VOC TFA 10-shot (mAP50) | VOC Kang 1-shot (mAP50) | VOC Kang 3-shot (mAP50) | VOC Kang 10-shot (mAP50) | MS COCO 10-shot (mAP) | MS COCO 30-shot (mAP) |
|---|---|---|---|---|---|---|---|---|---|
| LSTD | finetuning | - | - | - | 8.2 | 12.4 | 38.5 | - | - |
| RepMet | prototype | - | - | - | 26.1 | 34.4 | 41.3 | - | - |
| Meta-YOLO | modulation | 14.2 | 29.8 | - | 14.8 | 26.7 | 47.2 | 5.6 | 9.1 |
| MetaDet | modulation | - | - | - | 18.9 | 30.2 | 49.6 | 7.1 | 11.3 |
| Meta-RCNN | modulation | - | - | - | 19.9 | 35.0 | 51.5 | 8.7 | 12.4 |
| Faster RCNN+FT | finetuning | 9.9 | 21.6 | 35.6 | 15.2 | 29.0 | 45.5 | 9.2 | 12.5 |
| ACM-MetaRCNN | modulation | - | - | - | 31.9 | 35.9 | 53.1 | 9.4 | 12.8 |
| TFA w/fc | finetuning | 22.9 | 40.4 | 52.0 | 36.8 | 43.6 | 57.0 | 10.0 | 13.4 |
| TFA w/cos | finetuning | 25.3 | 42.1 | 52.8 | 39.8 | 44.7 | 56.0 | 10.0 | 13.7 |
| Retentive RCNN | finetuning | - | - | - | 42.0 | 46.0 | 56.0 | 10.5 | 13.8 |
| MPSR | finetuning | - | - | - | 41.7 | 51.4 | 61.8 | 9.8 | 14.1 |
| Attention-FSOD | modulation | - | - | - | - | - | - | 12.0 | - |
| FsDetView | finetuning | 24.2 | 42.2 | 57.4 | - | - | - | 12.5 | 14.7 |
| CME | finetuning | - | - | - | 41.5 | 50.4 | 60.9 | 15.1 | 16.9 |
| TIP | add-on | 27.7 | 43.3 | 59.6 | - | - | - | 16.3 | 18.3 |
| DAnA | modulation | - | - | - | - | - | - | 18.6 | 21.6 |
| DeFRCN | prototype | - | - | - | 53.6 | 61.5 | 60.8 | 18.5 | 22.6 |
| Meta-DETR | modulation | 20.4 | 46.6 | 57.8 | - | - | - | 17.8 | 22.9 |
| DETReg | finetuning | - | - | - | - | - | - | 18.0 | 30.0 |
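
To make the gap between the two splits concrete, the short sketch below subtracts the TFA 1-shot score from the Kang 1-shot score for the four methods in the table that report both. The numbers are copied from the table; the dictionary layout and the function name `split_gaps` are only illustrative choices.

```python
# (VOC TFA 1-shot, VOC Kang 1-shot) mAP50 values copied from the leaderboard
# table, restricted to the methods that report both columns.
one_shot_map50 = {
    "Meta-YOLO": (14.2, 14.8),
    "Faster RCNN+FT": (9.9, 15.2),
    "TFA w/fc": (22.9, 36.8),
    "TFA w/cos": (25.3, 39.8),
}


def split_gaps(scores):
    """Return Kang-minus-TFA 1-shot mAP50 per method, rounded to 0.1."""
    return {name: round(kang - tfa, 1) for name, (tfa, kang) in scores.items()}


gaps = split_gaps(one_shot_map50)
for name, gap in gaps.items():
    print(f"{name:15s} Kang - TFA = {gap:+.1f} mAP50")
```

For every method listed, the Kang 1-shot score is higher than the TFA 1-shot score, and for the two TFA variants the difference exceeds 13 mAP50 points.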

Few-Shot Object Detection Explained

We explain the few-shot object detection framework as defined by the Meta-YOLO paper (Kang's splits - full details here). FSOD partitions objects into two disjoint sets of categories: base or known/source classes, which are object categories for which we have access to a large number of training examples; and novel or unseen/target classes, for which we have only a few training examples (shots) per class. The FSOD task is formalized into the following steps:

  1. Base training.¹ Annotations are given only for the base classes, with a large number of training examples per class (bikes in the example). We train the FSOD method on the base classes.
  2. Few-shot finetuning. Annotations are given for the support set, a very small number of training examples from both the base and novel classes (one bike and one human in the example). Most methods finetune the FSOD model on the support set, but some methods use the support set only for conditioning during evaluation (finetuning-free methods).
  3. Few-shot evaluation. We evaluate the FSOD method on jointly detecting base and novel classes in the test set ("few-shot" refers to the size of the support set). Performance metrics are reported separately for base and novel classes. Common evaluation metrics are variants of mean average precision: mAP50 for Pascal and COCO-style mAP for COCO. They are often denoted bAP50, bAP75, and bAP for the base classes and nAP50, nAP75, and nAP for the novel classes, where the number is the IoU threshold in percent.
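
As a rough illustration of the metric names in step 3, the sketch below shows how an IoU threshold decides whether a predicted box matches a ground-truth box, and how per-class AP values are averaged separately over base and novel classes. It assumes boxes in `(x1, y1, x2, y2)` format and that per-class AP has already been computed by a standard evaluator. The function names and the AP values in the example are hypothetical placeholders, not results from any paper.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def is_match(pred_box, gt_box, iou_threshold=0.5):
    """A prediction counts as correct at AP50 when its IoU is at least 0.5."""
    return box_iou(pred_box, gt_box) >= iou_threshold


def group_map(per_class_ap, classes):
    """Mean of per-class AP over one group of classes (base or novel)."""
    return sum(per_class_ap[c] for c in classes) / len(classes)


# Hypothetical per-class AP50 values, used only to show the averaging.
per_class_ap50 = {"bike": 0.8, "car": 0.6, "human": 0.2}
bAP50 = group_map(per_class_ap50, ["bike", "car"])  # base classes
nAP50 = group_map(per_class_ap50, ["human"])        # novel classes
```

Replacing the 0.5 threshold with 0.75 gives the matching rule behind bAP75 and nAP75. COCO-style mAP averages AP over a range of IoU thresholds rather than using a single one.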

In pure FSOD, methods are usually compared solely on novel-class performance, whereas in Generalized FSOD, methods are compared on both base-class and novel-class performance [2]. Note that the "training" and "test" sets refer to the splits used in traditional object detection. Base and novel classes are typically present in both the training and test sets. However, the novel-class annotations are filtered out of the training set during base training. During few-shot finetuning, the support set is typically a fixed subset of the training set. During few-shot evaluation, the whole test set is used to reduce uncertainty [1].
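
The data handling described above can be sketched as follows. This is a simplified illustration, not the code used to build any benchmark split: it assumes each annotation is a dictionary with a `category` key (a hypothetical layout), treats annotations independently of the images they come from, and fixes the support set through a random seed.

```python
import random
from collections import defaultdict


def base_training_annotations(train_annotations, novel_classes):
    """Drop novel-class annotations, as is done before base training."""
    return [a for a in train_annotations if a["category"] not in novel_classes]


def sample_support_set(train_annotations, classes, k_shots, seed=0):
    """Pick k_shots annotations per class from the training set.

    The fixed seed makes the support set reproducible. Every class must have
    at least k_shots annotations, otherwise random.sample raises ValueError.
    """
    by_class = defaultdict(list)
    for a in train_annotations:
        by_class[a["category"]].append(a)
    rng = random.Random(seed)
    support = []
    for c in sorted(classes):
        support.extend(rng.sample(by_class[c], k_shots))
    return support


# Toy training set: five bike annotations (base) and three human ones (novel).
train = [{"image_id": i, "category": c}
         for i, c in enumerate(["bike"] * 5 + ["human"] * 3)]

base_only = base_training_annotations(train, novel_classes={"human"})
support = sample_support_set(train, classes={"bike", "human"}, k_shots=1)
```

Here `base_only` is what base training would see, and `support` holds one annotation per class for few-shot finetuning. Real benchmarks also have to deal with images that contain several objects, which this sketch ignores.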

For conditioning-based methods with no finetuning, few-shot finetuning and few-shot evaluation are merged into a single step; the novel examples are used as support examples to condition the model, and predictions are made directly on the test set. In practice, the majority of conditioning-based methods reviewed in this survey do benefit from some form of finetuning.
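
One simple way to picture conditioning without finetuning is prototype matching: average the feature vectors of the support examples of each class, then label a query region by its most similar prototype. The sketch below shows only this idea. The feature vectors are made-up two-dimensional placeholders standing in for the output of a frozen feature extractor, all function names are hypothetical, and the methods in the leaderboard use their own, more elaborate conditioning mechanisms.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def build_prototypes(support_features):
    """Map each class to the mean of its support feature vectors."""
    return {c: mean_vector(feats) for c, feats in support_features.items()}


def classify(query_feature, prototypes):
    """Return the class whose prototype is most similar to the query."""
    return max(prototypes, key=lambda c: cosine(query_feature, prototypes[c]))


# Placeholder features for the support examples (assumed, not real data).
support_features = {
    "bike": [[1.0, 0.0], [0.9, 0.1]],
    "human": [[0.0, 1.0]],
}
prototypes = build_prototypes(support_features)
label_a = classify([0.8, 0.2], prototypes)
label_b = classify([0.1, 0.9], prototypes)
```

No weights are updated here: adding a new class only requires computing one more prototype from its support examples, which is why finetuning and evaluation collapse into a single step in this setting.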

¹In the context of self-supervised learning, base training may also be referred to as finetuning or training. It should not be confused with base training in the meta-learning framework; rather, it is similar to the meta-training phase [3].

Owner
Gabriel Huang
PhD student at MILA