Alignment Attention Fusion framework for Few-Shot Object Detection

Overview

AAF framework

Framework generalities

This repository contains the code of the AAF framework proposed in this paper. The goal of this work is to provide a flexible framework for implementing various attention mechanisms for Few-Shot Object Detection. The framework is composed of three modules, Spatial Alignment, Global Attention and Fusion Layer, which are applied successively to combine features from query and support images.

The inputs of the framework are:

  • query_features List[Tensor(B, C, H, W)]: Query features at different levels. For each level, the features have shape Batch x Channels x Height x Width.
  • support_features List[Tensor(N, C, H', W')]: Support features at different levels. The first dimension corresponds to the number of support images, grouped by class: N = N_WAY * K_SHOT.
  • support_targets List[BoxList]: Bounding boxes of the objects in each support image.
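
As a rough illustration of the shapes listed above, the sketch below builds dummy inputs with NumPy arrays standing in for tensors. The sizes, the two feature levels, the plain tuples used instead of BoxList, and the assumption that support images of one class are stored contiguously are all made up for the example.

```python
import numpy as np

# Hypothetical sizes, chosen for illustration only.
B, C = 2, 8             # query batch size, number of channels
N_WAY, K_SHOT = 3, 2    # classes per episode, support images per class
N = N_WAY * K_SHOT      # total number of support images

# Two feature levels with hypothetical spatial sizes (H, W) and (H', W').
query_sizes = [(32, 32), (16, 16)]
support_sizes = [(16, 16), (8, 8)]

query_features = [np.zeros((B, C, h, w), dtype=np.float32) for h, w in query_sizes]
support_features = [np.zeros((N, C, h, w), dtype=np.float32) for h, w in support_sizes]

# One list of (x1, y1, x2, y2) boxes per support image; a stand-in for BoxList.
support_targets = [[(2.0, 2.0, 10.0, 10.0)] for _ in range(N)]


def class_of(support_index: int) -> int:
    """Class of a support image, assuming images of one class are contiguous."""
    return support_index // K_SHOT
```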

The framework can be configured using a separate config file. Examples of such files are available under /config_files/aaf_framework/. The structure of these files is simple:

ALIGN_FIRST: # True/False; run Alignment before Attention when True
OUT_CH: # Number of features output by the fusion layer
ALIGNMENT:
    MODE: # Name of the alignment module selected
ATTENTION:
    MODE: # Name of the attention module selected
FUSION:
    MODE: # Name of the fusion module selected

| File name | Method | Alignment | Attention | Fusion |
| --- | --- | --- | --- | --- |
| identity.yaml | Identity | IDENTITY | IDENTITY | IDENTITY |
| feature_reweighting.yaml | FSOD via feature reweighting | IDENTITY | REWEIGHTING_BATCH | IDENTITY |
| meta_faster_rcnn.yaml | Meta Faster-RCNN | SIMILARITY_ALIGN | META_FASTER | META_FASTER |
| self_adapt.yaml | Self-adaptive attention for FSOD | IDENTITY_NO_REPEAT | GRU | IDENTITY |
| dynamic.yaml | Dynamic relevance learning | IDENTITY | INTERPOLATE | DYNAMIC_R |
| dana.yaml | Dual Awareness Attention for FSOD | CISA | BGA | HADAMARD |

The path to the AAF config file should be specified inside the master config file (i.e. for the whole network) under FEWSHOT.AAF.CFG.

For each module, the classes implementing the available choices are grouped in a single file: /modelling/aaf/alignment.py, /modelling/aaf/attention.py and /modelling/aaf/fusion.py.
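
To make the role of the config keys concrete, here is a minimal sketch of how one implementation per module could be selected by name and chained in the order given by ALIGN_FIRST. The `build_pipeline` function, the registries and the dummy modules are hypothetical and are not taken from the repository; the config values are illustrative.

```python
from typing import Callable, Dict, List

trace: List[str] = []  # records the order in which the dummy modules run


def make_module(name: str) -> Callable:
    """Dummy alignment/attention module: logs its name, returns inputs unchanged."""
    def module(query, support):
        trace.append(name)
        return query, support
    return module


def dummy_fusion(query, support):
    trace.append("fusion")
    return f"fused({query}, {support})"


def build_pipeline(cfg: dict, registries: Dict[str, Dict[str, Callable]]) -> Callable:
    """Pick one implementation per module and chain them in the configured order."""
    align = registries["ALIGNMENT"][cfg["ALIGNMENT"]["MODE"]]
    attend = registries["ATTENTION"][cfg["ATTENTION"]["MODE"]]
    fuse = registries["FUSION"][cfg["FUSION"]["MODE"]]
    first, second = (align, attend) if cfg["ALIGN_FIRST"] else (attend, align)

    def run(query, support):
        query, support = first(query, support)
        query, support = second(query, support)
        return fuse(query, support)

    return run


# Hypothetical registries mapping MODE names to implementations.
registries = {
    "ALIGNMENT": {"CISA": make_module("alignment")},
    "ATTENTION": {"BGA": make_module("attention")},
    "FUSION": {"HADAMARD": dummy_fusion},
}

# Same structure as an AAF config file; the values are assumptions.
cfg = {
    "ALIGN_FIRST": True,
    "OUT_CH": 256,
    "ALIGNMENT": {"MODE": "CISA"},
    "ATTENTION": {"MODE": "BGA"},
    "FUSION": {"MODE": "HADAMARD"},
}

pipeline = build_pipeline(cfg, registries)
result = pipeline("query", "support")
```

Setting ALIGN_FIRST to False in this sketch swaps the first two stages and leaves fusion last.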

Spatial Alignment

Spatial Alignment spatially reorganizes the features of one feature map to match those of another. The idea is to align similar features in both maps so that comparison is easier.

| Name | Description |
| --- | --- |
| IDENTITY | Repeats the features to match BNCHW and NBCHW dimensions. |
| IDENTITY_NO_REPEAT | Identity without repetition. |
| SIMILARITY_ALIGN | Computes a similarity matrix between support and query and aligns support to query accordingly. |
| CISA | CISA block from this method. |
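
The idea behind similarity-based alignment can be sketched as follows, using NumPy in place of torch and a single query/support pair. This is an illustration of the general technique, not the repository's SIMILARITY_ALIGN class: the function name, the dot-product similarity, the scaling by the square root of the channel count and the softmax normalization are all assumptions.

```python
import numpy as np


def similarity_align(query: np.ndarray, support: np.ndarray) -> np.ndarray:
    """Re-sample a support map (C, Hs, Ws) onto the grid of a query map (C, H, W)."""
    c, h, w = query.shape
    q = query.reshape(c, h * w).T           # (H*W, C): one row per query position
    s = support.reshape(c, -1).T            # (Hs*Ws, C): one row per support position
    sim = q @ s.T / np.sqrt(c)              # (H*W, Hs*Ws) similarity matrix
    sim -= sim.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    weights = np.exp(sim)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over support positions
    aligned = weights @ s                   # (H*W, C): weighted mix of support features
    return aligned.T.reshape(c, h, w)       # same spatial size as the query


rng = np.random.default_rng(0)
query = rng.normal(size=(4, 5, 6))
support = rng.normal(size=(4, 3, 3))
aligned_support = similarity_align(query, support)  # shape (4, 5, 6)
```

After this step the support map has the query's spatial size, so point-wise operations between the two become possible.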

Global Attention

Global Attention highlights some features of a map according to an attention vector computed globally on another one. The idea is to leverage global, and hopefully semantic, information.

| Name | Description |
| --- | --- |
| IDENTITY | Simply passes features to the next modules. |
| REWEIGHTING | Reweights query features using globally pooled vectors from support. |
| REWEIGHTING_BATCH | Same as above but support examples are the same for the whole batch. |
| SELF_ATTENTION | Same as above but attention vectors are computed from the alignment matrix between query and support. |
| BGA | BGA blocks from this method. |
| META_FASTER | Attention block from this method. |
| POOLING | Pools query and support features to the same size. |
| INTERPOLATE | Upsamples support features to match query size. |
| GRU | Computes attention vectors through a graph representation using a GRU. |
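
As an illustration of reweighting with globally pooled vectors, the sketch below pools each support map into one vector per image and uses it to scale the query channels. It uses NumPy instead of torch; the function name, the choice of average pooling and the (B, N, C, H, W) output layout are assumptions rather than the repository's implementation.

```python
import numpy as np


def reweight(query: np.ndarray, support: np.ndarray) -> np.ndarray:
    """Scale query channels by globally pooled support vectors.

    query:   (B, C, H, W)
    support: (N, C, Hs, Ws)
    returns: (B, N, C, H, W), one reweighted copy of the query per support image
    """
    vectors = support.mean(axis=(2, 3))                      # (N, C) global average pooling
    return query[:, None] * vectors[None, :, :, None, None]  # broadcast over B, H and W


query = np.arange(2 * 3 * 4 * 4, dtype=np.float64).reshape(2, 3, 4, 4)
support = np.ones((5, 3, 2, 2))
reweighted = reweight(query, support)  # shape (2, 5, 3, 4, 4)
```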

Fusion Layer

The Fusion Layer directly combines the features from support and query. These maps must have the same dimensions for point-wise operations, which is why fusion is often used together with alignment.

| Name | Description |
| --- | --- |
| IDENTITY | Returns only adapted query features. |
| ADD | Point-wise sum between query and support features. |
| HADAMARD | Point-wise multiplication between query and support features. |
| SUBSTRACT | Point-wise subtraction between query and support features. |
| CONCAT | Channel concatenation of query and support features. |
| META_FASTER | Fusion layer from this method. |
| DYNAMIC_R | Fusion layer from this method. |
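
The simple point-wise modes can be sketched in a few lines. The `fuse` function below is a hypothetical NumPy illustration, not the repository's code; it assumes (B, C, H, W) inputs, channel concatenation along axis 1, and query minus support for the subtraction order.

```python
import numpy as np


def fuse(query: np.ndarray, support: np.ndarray, mode: str) -> np.ndarray:
    """Combine query and support maps of shape (B, C, H, W)."""
    if mode == "CONCAT":
        # Channels may differ; batch and spatial sizes must still match.
        return np.concatenate([query, support], axis=1)
    if query.shape != support.shape:
        raise ValueError("point-wise fusion needs maps of identical shape")
    if mode == "ADD":
        return query + support
    if mode == "HADAMARD":
        return query * support
    if mode == "SUBSTRACT":  # spelling follows the mode name used by the framework
        return query - support
    raise ValueError(f"unknown fusion mode: {mode}")


query = np.full((1, 2, 3, 3), 3.0)
support = np.full((1, 2, 3, 3), 2.0)
summed = fuse(query, support, "ADD")
stacked = fuse(query, support, "CONCAT")  # shape (1, 4, 3, 3)
```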

Training and evaluation

Training and evaluation scripts are available.

TODO: Give a code snippet to run training with a specified config file (modify main). Basically, create two scripts, train.py and eval.py, that take the config file as an argument.

DataHandler

Explain DataHandler class a bit.

Installation

The dependencies of this project can be installed with conda create --name <env> --file requirements.txt. Please note that not all of these requirements are necessary; the list will be updated soon.

FCOS must be installed from source. Depending on the versions of the Python packages you use, some issues may appear after installation:

  • cpu/vision.h file not found: replace all occurrences in the FCOS source with vision.h (see this issue).
  • Error related to AT_CHECK with pytorch > 1.5: replace all occurrences with TORCH_CHECK (see this issue).
  • Error related to torch._six.PY36: replace all occurrences of PY36 with PY37.
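
The replacements above can be applied by hand or with a small helper such as the sketch below. The `patch_sources` function and the list of file extensions are assumptions, not part of FCOS or of this repository; keep only the replacements that match the errors you actually get, and run it on a copy or a version-controlled checkout.

```python
from pathlib import Path
from typing import Dict, List

# Replacements taken from the list above; remove the ones you do not need.
REPLACEMENTS: Dict[str, str] = {
    "cpu/vision.h": "vision.h",
    "AT_CHECK": "TORCH_CHECK",
    "PY36": "PY37",
}

# File types to scan; an assumption, adjust to the FCOS tree you have.
EXTENSIONS = {".h", ".cpp", ".cu", ".py"}


def patch_sources(root: str, replacements: Dict[str, str] = REPLACEMENTS) -> List[str]:
    """Apply plain-text replacements under `root` and return the modified files."""
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in EXTENSIONS:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # skip files that are not valid UTF-8 text
        patched = text
        for old, new in replacements.items():
            patched = patched.replace(old, new)
        if patched != text:
            path.write_text(patched, encoding="utf-8")
            changed.append(str(path))
    return changed
```

For example, `patch_sources("path/to/FCOS")` would rewrite the matching files in place and return their paths; the path here is a placeholder.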

Results

Results on Pascal VOC, COCO and DOTA.

Owner
Pierre Le Jeune
PhD Student in Few-shot object detection.