Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation (ACM MM 2020)

Last update: Jan 06, 2023

Official implementation of:

Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation
Jialian Wu, Liangchen Song, Tiancai Wang, Qian Zhang and Junsong Yuan
In ACM International Conference on Multimedia, Seattle WA, October 12-16, 2020.

Many thanks to mmdetection authors for their great framework!

News

Mar 2, 2021 Update: We tested Forest R-CNN on the LVIS v1.0 val set. Thank you for considering a comparison with our method :)

Jan 1, 2021 Update: We propose Forest DetSeg, an extension of the original Forest R-CNN that applies the proposed method to RetinaNet. The new work is currently under review, but the code is already available. More details will follow with the new paper.

Installation

Please refer to INSTALL.md for installation and dataset preparation.

Forest R-CNN

Inference

# Examples
# single-gpu testing
python tools/test.py configs/lvis/forest_rcnn_r50_fpn.py forest_rcnn_res50.pth --out out.pkl --eval bbox segm

# multi-gpu testing
./tools/dist_test.sh configs/lvis/forest_rcnn_r50_fpn.py forest_rcnn_res50.pth ${GPU_NUM} --out out.pkl --eval bbox segm
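
The `--out out.pkl` file can be inspected afterwards in Python. The following is a minimal sketch, not part of this repository: it assumes the file is a plain pickle holding one entry per image, that each entry is either a per-class list of boxes or a `(bbox_result, segm_result)` tuple, and that each box is `[x1, y1, x2, y2, score]`, as in mmdetection releases of that period. The helper names `load_results` and `count_confident_boxes` are hypothetical.

```python
import pickle
from collections import Counter


def load_results(path):
    """Load the file written by --out (assumed to be a plain pickle)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def count_confident_boxes(results, score_thr=0.5):
    """Count boxes per class index whose score is at least score_thr.

    Assumption: each box is a sequence [x1, y1, x2, y2, score].
    """
    counts = Counter()
    for per_image in results:
        # Mask models are assumed to store (bbox_result, segm_result) per image.
        bbox_result = per_image[0] if isinstance(per_image, tuple) else per_image
        for class_idx, boxes in enumerate(bbox_result):
            for box in boxes:
                if box[4] >= score_thr:
                    counts[class_idx] += 1
    return counts


# Synthetic stand-in for two images and three classes (not real model output).
toy_results = [
    ([[[0, 0, 10, 10, 0.9]], [], [[5, 5, 20, 20, 0.3]]], None),
    ([[], [[1, 1, 4, 4, 0.7], [2, 2, 6, 6, 0.6]], []], None),
]
toy_counts = count_confident_boxes(toy_results, score_thr=0.5)
```

With a real run, `count_confident_boxes(load_results("out.pkl"))` would give a rough per-class detection count, provided the file follows the layout assumed above.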

Training

# Examples
# single-gpu training
python tools/train.py configs/lvis/forest_rcnn_r50_fpn.py --validate

# multi-gpu training
./tools/dist_train.sh configs/lvis/forest_rcnn_r50_fpn.py ${GPU_NUM} --validate

(Note: in our experiments, the best result appears around the 20th epoch rather than at the end of training.)
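
Because the best checkpoint may not be the last one, it can help to compare the per-epoch validation AP produced by `--validate`. The sketch below only illustrates the selection step: the function `best_epoch` and the dictionary of scores are hypothetical, and the numbers are placeholders, not measured results.

```python
def best_epoch(val_ap_by_epoch):
    """Return (epoch, ap) with the highest validation AP.

    Ties are resolved in favor of the earliest epoch.
    """
    if not val_ap_by_epoch:
        raise ValueError("no validation results given")
    # Sort by epoch so that max() keeps the earliest epoch among equal scores.
    return max(sorted(val_ap_by_epoch.items()), key=lambda item: item[1])


# Placeholder values for illustration only (epoch -> validation AP).
placeholder_scores = {1: 10.0, 2: 12.5, 3: 12.0, 4: 12.5}
chosen = best_epoch(placeholder_scores)
```

Here `chosen` is the `(epoch, ap)` pair to use when picking which saved checkpoint to evaluate.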

Forest RetinaNet

Inference

# Examples
# multi-gpu testing
./tools/dist_test.sh configs/lvis/forest_retinanet_r50_fpn_1x.py forest_retinanet_res50.pth ${GPU_NUM} --out out.pkl --eval bbox segm

Training

# Examples
# multi-gpu training
./tools/dist_train.sh configs/lvis/forest_retinanet_r50_fpn_1x.py ${GPU_NUM} --validate

Main Results

Instance Segmentation on LVIS v0.5 val set

AP and AP.b denote the mask AP and box AP, respectively. r, c, and f denote the rare, common, and frequent categories.
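
As a rough illustration of the r/c/f split, the sketch below assigns a category to a group from the number of training images it appears in. It assumes the usual LVIS convention (rare: 1 to 10 images, common: 11 to 100, frequent: more than 100); the function name `lvis_frequency_group` is hypothetical and not part of this repository.

```python
def lvis_frequency_group(num_train_images):
    """Map a category's training-image count to 'r', 'c', or 'f'.

    Assumed thresholds: rare 1-10, common 11-100, frequent > 100.
    """
    if num_train_images < 1:
        raise ValueError("a category must appear in at least one image")
    if num_train_images <= 10:
        return "r"
    if num_train_images <= 100:
        return "c"
    return "f"


groups = [lvis_frequency_group(n) for n in (1, 10, 11, 100, 101)]
```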

Method	Backbone	AP	AP.r	AP.c	AP.f	AP.b	AP.b.r	AP.b.c	AP.b.f	download
MaskRCNN	R50-FPN	21.7	6.8	22.6	26.4	21.8	6.5	21.6	28.0	model
Forest R-CNN	R50-FPN	25.6	18.3	26.4	27.6	25.9	16.9	26.1	29.2	model
MaskRCNN	R101-FPN	23.6	10.0	24.8	27.6	23.5	8.7	23.1	29.8	model
Forest R-CNN	R101-FPN	26.9	20.1	27.9	28.3	27.5	20.0	27.5	30.4	model
MaskRCNN	X-101-32x4d-FPN	24.8	10.0	26.4	28.6	24.8	8.6	25.0	30.9	model
Forest R-CNN	X-101-32x4d-FPN	28.5	21.6	29.7	29.7	28.8	20.6	29.2	31.7	model
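
To read the table above more easily, the gains of Forest R-CNN over the MaskRCNN baseline can be computed per backbone. This is a small sketch using only the mask AP and AP.r columns copied from the table; the variable and function names are illustrative.

```python
# (mask AP, mask AP.r) per method, copied from the LVIS v0.5 table.
v05_results = {
    "R50-FPN": {"MaskRCNN": (21.7, 6.8), "Forest R-CNN": (25.6, 18.3)},
    "R101-FPN": {"MaskRCNN": (23.6, 10.0), "Forest R-CNN": (26.9, 20.1)},
    "X-101-32x4d-FPN": {"MaskRCNN": (24.8, 10.0), "Forest R-CNN": (28.5, 21.6)},
}


def gains_over_baseline(results):
    """Return {backbone: (AP gain, AP.r gain)}, rounded to one decimal."""
    gains = {}
    for backbone, methods in results.items():
        base = methods["MaskRCNN"]
        ours = methods["Forest R-CNN"]
        gains[backbone] = tuple(round(o - b, 1) for o, b in zip(ours, base))
    return gains


v05_gains = gains_over_baseline(v05_results)
```

The AP.r gain is consistently the larger of the two, which matches the table's pattern of the biggest improvements falling on rare categories.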

Instance Segmentation on LVIS v1.0 val set

Method	Backbone	AP	AP.r	AP.c	AP.f	AP.b
MaskRCNN	R50-FPN	19.2	0.0	17.2	29.5	20.0
Forest R-CNN	R50-FPN	23.2	14.2	22.7	27.7	24.6

Visualized Examples

Citation

If you find it useful in your research, please consider citing our paper as follows:

@inproceedings{wu2020forest,
title={Forest R-CNN: Large-vocabulary long-tailed object detection and instance segmentation},
author={Wu, Jialian and Song, Liangchen and Wang, Tiancai and Zhang, Qian and Yuan, Junsong},
booktitle={Proceedings of the 28th ACM International Conference on Multimedia},
pages={1570--1578},
year={2020}}
