Official PyTorch implementation of RIO

Last update: May 20, 2022

Overview

Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection

Figure 1: Our proposed Resampling at Image-level and Object-level (RIO).

Project page | Paper

Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection.
Nadine Chang, Zhiding Yu, Yu-Xiong Wang, Anima Anandkumar, Sanja Fidler, Jose M. Alvarez.
ICML 2021.

This repository contains the official PyTorch implementation of RIO: the training and evaluation code, plus the pretrained models.

Abstract

Training on datasets with long-tailed distributions has been challenging for major recognition tasks such as classification and detection. To deal with this challenge, image resampling is typically introduced as a simple but effective approach. However, we observe that long-tailed detection differs from classification since multiple classes may be present in one image. As a result, image resampling alone is not enough to yield a sufficiently balanced distribution at the object level. We address object-level resampling by introducing an object-centric memory replay strategy based on dynamic, episodic memory banks. Our proposed strategy has two benefits: 1) convenient object-level resampling without significant extra computation, and 2) implicit feature-level augmentation from model updates. We show that image-level and object-level resamplings are both important, and thus unify them with a joint resampling strategy (RIO). Our method outperforms state-of-the-art long-tailed detection and segmentation methods on LVIS v0.5 across various backbones.
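To make the object-level idea in the abstract more concrete, here is a minimal, purely illustrative sketch of a per-class memory bank with first-in-first-out eviction. It is not the implementation shipped in this repository: the class name `ClassMemoryBank`, the `capacity` parameter, and the uniform sampling rule are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict, deque


class ClassMemoryBank:
    """Toy per-class memory bank (hypothetical; not the RIO implementation)."""

    def __init__(self, capacity=4, seed=0):
        # One bounded FIFO queue per class: once a queue is full,
        # appending a new entry evicts the oldest one.
        self.banks = defaultdict(lambda: deque(maxlen=capacity))
        self.rng = random.Random(seed)

    def push(self, class_id, feature):
        """Store one object-level entry (e.g. an RoI feature) for a class."""
        self.banks[class_id].append(feature)

    def replay(self, class_id, k):
        """Sample up to k stored entries of a class without replacement."""
        stored = list(self.banks[class_id])
        return self.rng.sample(stored, min(k, len(stored)))


bank = ClassMemoryBank(capacity=2)
for step in range(5):
    bank.push("rare_class", f"feat_{step}")  # only the 2 newest entries survive
replayed = bank.replay("rare_class", k=3)
```

In this toy version, replaying stored entries for rare classes raises how often those classes appear in a batch without loading extra images. The bounded queue only stands in for the dynamic, episodic behaviour described in the abstract.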

Requirements

Linux or macOS with Python >= 3.6
PyTorch >= 1.5 and the torchvision version that matches the PyTorch installation. Please refer to the download guidelines on the PyTorch website
Detectron2
OpenCV is optional but required for visualizations
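The requirements above can be checked with a short script before installing anything else. This is a minimal sketch that only tests the Python version and whether each package is importable. It does not verify the PyTorch >= 1.5 constraint, and the import names (`torch`, `torchvision`, `detectron2`, `cv2`) are the usual ones for these packages rather than something stated in this README.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)  # from the requirement list above
PACKAGES = {  # import name -> whether the list above marks it as required
    "torch": True,
    "torchvision": True,
    "detectron2": True,
    "cv2": False,  # OpenCV, only needed for visualization
}


def check_environment():
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append("Python >= 3.6 is required")
    for name, required in PACKAGES.items():
        # find_spec returns None when a top-level package cannot be located.
        if importlib.util.find_spec(name) is None:
            kind = "required" if required else "optional"
            problems.append(f"{kind} package not found: {name}")
    return problems


for line in check_environment():
    print(line)
```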

Installation

Detectron2

Please refer to the installation instructions in Detectron2.

We use Detectron2 v0.3 as the codebase. Thus, we advise installing Detectron2 from a clone of this repository.

LVIS Dataset

The dataset can be downloaded from the official LVIS website. Please follow Detectron2's guidelines on the expected LVIS dataset structure.
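As a quick sanity check of the dataset layout, the sketch below lists which expected entries are missing. The paths follow the layout described in Detectron2's dataset documentation (COCO 2017 images plus LVIS v0.5 annotation files, rooted at `DETECTRON2_DATASETS` or `./datasets`). Treat them as an assumption and adjust them if your Detectron2 version documents something different; the helper name `missing_lvis_paths` is hypothetical.

```python
import os
from pathlib import Path

# Assumed layout, following Detectron2's dataset documentation.
EXPECTED = [
    "coco/train2017",
    "coco/val2017",
    "lvis/lvis_v0.5_train.json",
    "lvis/lvis_v0.5_val.json",
]


def missing_lvis_paths(root=None):
    """Return the expected entries that do not exist under the dataset root."""
    # Detectron2 reads the dataset root from DETECTRON2_DATASETS,
    # falling back to ./datasets when the variable is unset.
    root = Path(root or os.environ.get("DETECTRON2_DATASETS", "datasets"))
    return [rel for rel in EXPECTED if not (root / rel).exists()]


print(missing_lvis_paths())
```

An empty list means every entry was found. For LVIS v1.0 configs, the annotation file names in `EXPECTED` would need to be changed accordingly.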

Our Setup

Python 3.6.9
PyTorch 1.5.0 with CUDA 10.2
Detectron2 built from this repository.

Pretrained Models

Detection and Instance Segmentation on LVIS v0.5

Backbone	Method	AP.b	AP.b.r	AP.b.c	AP.b.f	AP.m	AP.m.r	AP.m.c	AP.m.f	download
R50-FPN	MaskRCNN-RIO	25.7	17.2	25.1	29.8	26.0	18.9	26.2	28.5	model
R101-FPN	MaskRCNN-RIO	27.3	19.1	26.8	31.2	27.7	20.1	28.3	30.0	model
X101-FPN	MaskRCNN-RIO	28.6	19.0	28.0	33.0	28.9	19.5	29.7	31.6	model

Training & Evaluation

Our code is located under projects/RIO.

Training and evaluation follow the standard Detectron2 workflow. We provide config files for both LVIS v0.5 and LVIS v1.0.

Example: Training Mask R-CNN ResNet-50 on LVIS v0.5

# We advise multi-gpu training
cd projects/RIO
python memory_train_net.py \
--num-gpus 4 \
--config-file=configs/LVISv0.5-InstanceSegmentation/memory_mask_rcnn_R_50_FPN_1x.yaml

Example: Evaluating Mask R-CNN ResNet-50 on LVIS v0.5

# KEY VALUE config overrides such as MODEL.WEIGHTS go last,
# after all named flags
cd projects/RIO
python memory_train_net.py \
--config-file configs/LVISv0.5-InstanceSegmentation/memory_mask_rcnn_R_50_FPN_1x.yaml \
--eval-only MODEL.WEIGHTS /path/to/model_checkpoint

By default, LVIS evaluation follows immediately after training.

Visualization

Detectron2 has built-in visualization tools. In the tools folder, visualize_json_results.py can be used to visualize the JSON instance detection/segmentation results produced by LVISEvaluator.

python visualize_json_results.py --input x.json --output dir/ --dataset lvis

Further information can be found in the README of Detectron2's tools folder.
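Before rendering images, it can help to inspect the results file directly. The sketch below assumes the file is a JSON list of COCO-style prediction records with `category_id` and `score` fields, which is the format Detectron2's evaluators normally write. The helper name `count_confident`, the 0.5 threshold, and the sample records are made up for illustration.

```python
import json
from collections import Counter


def count_confident(results, threshold=0.5):
    """Count predictions per category_id whose score is >= threshold."""
    return Counter(r["category_id"] for r in results if r["score"] >= threshold)


# Tiny in-memory stand-in for the JSON file; the values are made up.
sample = json.loads(
    '[{"image_id": 1, "category_id": 7, "bbox": [0, 0, 10, 10], "score": 0.9},'
    ' {"image_id": 1, "category_id": 7, "bbox": [5, 5, 10, 10], "score": 0.2},'
    ' {"image_id": 2, "category_id": 3, "bbox": [1, 1, 4, 4], "score": 0.6}]'
)
counts = count_confident(sample)
```

For a real run, replace the inline sample with `json.load(open("x.json"))`, where `x.json` is the same file passed to `--input` above.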

License

Please check the LICENSE file. RIO may be used non-commercially, meaning for research or evaluation purposes only. For business inquiries, please contact [email protected].

Citation

@article{chang2021image,
  title={Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection},
  author={Chang, Nadine and Yu, Zhiding and Wang, Yu-Xiong and Anandkumar, Anima and Fidler, Sanja and Alvarez, Jose M},
  journal={arXiv preprint arXiv:2104.05702},
  year={2021}
}

Owner

NVIDIA Research Projects