GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond

Last update: Dec 29, 2022

Overview

GCNet for Object Detection

By Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu.

This repo is an official implementation of "GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond" for COCO object detection, based on open-mmlab's mmdetection. The core operator, the GC block, can be found here. Many thanks to mmdetection for their simple and clean framework.

Update on 2020/12/07

The extension of GCNet got accepted by TPAMI (PDF).

Update on 2019/10/28

GCNet won the Best Paper Award at ICCV 2019 Neural Architects Workshop!

Update on 2019/07/01

The code has been refactored. More results are provided, and all configs can be found in configs/gcnet.

Notes: Both the official PyTorch SyncBN and Apex SyncBN have some stability issues. During training, mAP may drop to zero and then return to normal during the last few epochs.

Update on 2019/06/03

GCNet is supported by the official mmdetection repo here. Thanks again for open-mmlab's work on open source projects.

Introduction

GCNet was initially described on arXiv. By absorbing the advantages of Non-Local Networks (NLNet) and Squeeze-Excitation Networks (SENet), GCNet provides a simple, fast and effective approach to global context modeling, which generally outperforms both NLNet and SENet on major benchmarks for various recognition tasks.
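
This README does not spell out what a GC block computes, so here is a minimal PyTorch sketch of the structure described in the GCNet paper: attention pooling into one global context vector, a bottleneck transform, and additive fusion. It is an illustration, not the code in this repo; the class name GlobalContextBlock and its arguments are made up for the example.

```python
import torch
import torch.nn as nn


class GlobalContextBlock(nn.Module):
    """Illustrative GC block: context modeling -> bottleneck transform -> addition."""

    def __init__(self, channels: int, ratio: int = 16):
        super().__init__()
        hidden = max(channels // ratio, 1)  # bottleneck width, C / r
        # Context modeling: a 1x1 conv gives one attention logit per position.
        self.attn = nn.Conv2d(channels, 1, kernel_size=1)
        # Transform: two 1x1 convs with LayerNorm + ReLU in between.
        self.transform = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1),
            nn.LayerNorm([hidden, 1, 1]),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        # Softmax over all H*W positions: query-independent attention weights.
        weights = self.attn(x).reshape(n, 1, h * w).softmax(dim=-1)  # (N, 1, HW)
        # Weighted sum of features: one global context vector per image.
        context = torch.bmm(x.reshape(n, c, h * w), weights.transpose(1, 2))  # (N, C, 1)
        context = context.reshape(n, c, 1, 1)
        # Fusion: broadcast-add the transformed context to every position.
        return x + self.transform(context)


block = GlobalContextBlock(channels=512, ratio=16)
out = block(torch.randn(2, 512, 28, 28))
print(out.shape)  # torch.Size([2, 512, 28, 28])
```

The block keeps the input shape, so it can sit inside a residual stage without changing the layers around it.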

Citing GCNet

@article{cao2019GCNet,
  title={GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond},
  author={Cao, Yue and Xu, Jiarui and Lin, Stephen and Wei, Fangyun and Hu, Han},
  journal={arXiv preprint arXiv:1904.11492},
  year={2019}
}

Main Results

Results on R50-FPN with backbone (fixBN)

Back-bone	Model	Back-bone Norm	Heads	Context	Lr schd	Mem (GB)	Train time (s/iter)	Inf time (fps)	box AP	mask AP	Download
R50-FPN	Mask	fixBN	2fc (w/o BN)	-	1x	3.9	0.453	10.6	37.3	34.2	model
R50-FPN	Mask	fixBN	2fc (w/o BN)	GC(c3-c5, r16)	1x	4.5	0.533	10.1	38.5	35.1	model
R50-FPN	Mask	fixBN	2fc (w/o BN)	GC(c3-c5, r4)	1x	4.6	0.533	9.9	38.9	35.5	model
R50-FPN	Mask	fixBN	2fc (w/o BN)	-	2x	-	-	-	38.2	34.9	model
R50-FPN	Mask	fixBN	2fc (w/o BN)	GC(c3-c5, r16)	2x	-	-	-	39.7	36.1	model
R50-FPN	Mask	fixBN	2fc (w/o BN)	GC(c3-c5, r4)	2x	-	-	-	40.0	36.2	model
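
To read the table above as gains over the baseline, the short script below subtracts the baseline AP from each GC row. The numbers are copied from the table; the dictionary layout and the function name gains are just for illustration.

```python
# (box AP, mask AP) copied from the R50-FPN fixBN table above.
results = {
    "1x": {"baseline": (37.3, 34.2), "GC r16": (38.5, 35.1), "GC r4": (38.9, 35.5)},
    "2x": {"baseline": (38.2, 34.9), "GC r16": (39.7, 36.1), "GC r4": (40.0, 36.2)},
}


def gains(schedule):
    """Return (box AP gain, mask AP gain) of each GC variant over the baseline."""
    base_box, base_mask = results[schedule]["baseline"]
    return {
        name: (round(box - base_box, 1), round(mask - base_mask, 1))
        for name, (box, mask) in results[schedule].items()
        if name != "baseline"
    }


for schedule in results:
    print(schedule, gains(schedule))
# 1x {'GC r16': (1.2, 0.9), 'GC r4': (1.6, 1.3)}
# 2x {'GC r16': (1.5, 1.2), 'GC r4': (1.8, 1.3)}
```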

Results on R50-FPN with backbone (syncBN)

Back-bone	Model	Back-bone Norm	Heads	Context	Lr schd	Mem (GB)	Train time (s/iter)	Inf time (fps)	box AP	mask AP	Download
R50-FPN	Mask	SyncBN	2fc (w/o BN)	-	1x	3.9	0.543	10.2	37.2	33.8	model
R50-FPN	Mask	SyncBN	2fc (w/o BN)	GC(c3-c5, r16)	1x	4.5	0.547	9.9	39.4	35.7	model
R50-FPN	Mask	SyncBN	2fc (w/o BN)	GC(c3-c5, r4)	1x	4.6	0.603	9.4	39.9	36.2	model
R50-FPN	Mask	SyncBN	2fc (w/o BN)	-	2x	3.9	0.543	10.2	37.7	34.3	model
R50-FPN	Mask	SyncBN	2fc (w/o BN)	GC(c3-c5, r16)	2x	4.5	0.547	9.9	39.7	36.0	model
R50-FPN	Mask	SyncBN	2fc (w/o BN)	GC(c3-c5, r4)	2x	4.6	0.603	9.4	40.2	36.3	model
R50-FPN	Mask	SyncBN	4conv1fc (SyncBN)	-	1x	-	-	-	38.8	34.6	model
R50-FPN	Mask	SyncBN	4conv1fc (SyncBN)	GC(c3-c5, r16)	1x	-	-	-	41.0	36.5	model
R50-FPN	Mask	SyncBN	4conv1fc (SyncBN)	GC(c3-c5, r4)	1x	-	-	-	41.4	37.0	model

Results on stronger backbones

Back-bone	Model	Back-bone Norm	Heads	Context	Lr schd	Mem (GB)	Train time (s/iter)	Inf time (fps)	box AP	mask AP	Download
R101-FPN	Mask	fixBN	2fc (w/o BN)	-	1x	5.8	0.571	9.5	39.4	35.9	model
R101-FPN	Mask	fixBN	2fc (w/o BN)	GC(c3-c5, r16)	1x	7.0	0.731	8.6	40.8	37.0	model
R101-FPN	Mask	fixBN	2fc (w/o BN)	GC(c3-c5, r4)	1x	7.1	0.747	8.6	40.8	36.9	model
R101-FPN	Mask	SyncBN	2fc (w/o BN)	-	1x	5.8	0.665	9.2	39.8	36.0	model
R101-FPN	Mask	SyncBN	2fc (w/o BN)	GC(c3-c5, r16)	1x	7.0	0.778	9.0	41.1	37.4	model
R101-FPN	Mask	SyncBN	2fc (w/o BN)	GC(c3-c5, r4)	1x	7.1	0.786	8.9	41.7	37.6	model
X101-FPN	Mask	SyncBN	2fc (w/o BN)	-	1x	7.1	0.912	8.5	41.2	37.3	model
X101-FPN	Mask	SyncBN	2fc (w/o BN)	GC(c3-c5, r16)	1x	8.2	1.055	7.7	42.4	38.0	model
X101-FPN	Mask	SyncBN	2fc (w/o BN)	GC(c3-c5, r4)	1x	8.3	1.037	7.6	42.9	38.5	model
X101-FPN	Cascade Mask	SyncBN	2fc (w/o BN)	-	1x	-	-	-	44.7	38.3	model
X101-FPN	Cascade Mask	SyncBN	2fc (w/o BN)	GC(c3-c5, r16)	1x	-	-	-	45.9	39.3	model
X101-FPN	Cascade Mask	SyncBN	2fc (w/o BN)	GC(c3-c5, r4)	1x	-	-	-	46.5	39.7	model
X101-FPN	DCN Cascade Mask	SyncBN	2fc (w/o BN)	-	1x	-	-	-	47.1	40.4	model
X101-FPN	DCN Cascade Mask	SyncBN	2fc (w/o BN)	GC(c3-c5, r16)	1x	-	-	-	47.9	40.9	model
X101-FPN	DCN Cascade Mask	SyncBN	2fc (w/o BN)	GC(c3-c5, r4)	1x	-	-	-	47.9	40.8	model

Notes

GC denotes that a Global Context (GC) block is inserted after the 1x1 conv of the backbone.
DCN denotes replacing the 3x3 conv with a 3x3 Deformable Convolution in the c3-c5 stages of the backbone.
r4 and r16 denote ratio 4 and ratio 16 in the GC block, respectively.
Some of the models are trained on 4 GPUs with 4 images on each GPU.
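
As a rough illustration of what the ratio means, the snippet below computes the GC block's bottleneck width (channels divided by ratio) for each stage. It assumes the block sees the standard ResNet bottleneck output widths of 512, 1024 and 2048 channels for c3, c4 and c5; the helper name is hypothetical.

```python
# Output channels of the c3-c5 stages in a standard ResNet-50/101 (assumed here
# to be the width the GC block operates on).
STAGE_CHANNELS = {"c3": 512, "c4": 1024, "c5": 2048}


def gc_bottleneck_channels(ratio):
    """Bottleneck width of the GC block per stage: channels // ratio."""
    return {stage: channels // ratio for stage, channels in STAGE_CHANNELS.items()}


print("r16:", gc_bottleneck_channels(16))  # r16: {'c3': 32, 'c4': 64, 'c5': 128}
print("r4: ", gc_bottleneck_channels(4))   # r4:  {'c3': 128, 'c4': 256, 'c5': 512}
```

A smaller ratio gives a wider bottleneck, which is consistent with the slightly higher memory use of the r4 rows in the tables above.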

Requirements

Linux (tested on Ubuntu 16.04)
Python 3.6+
PyTorch 1.1.0
Cython
apex (Sync BN)

Install

a. Install PyTorch 1.1 and torchvision following the official instructions.

b. Install the latest apex with CUDA and C++ extensions following these instructions. The Sync BN implemented by apex is required.

c. Clone the GCNet repository.

git clone https://github.com/xvjiarui/GCNet.git

d. Compile cuda extensions.

cd GCNet
pip install cython  # or "conda install cython" if you prefer conda
./compile.sh  # or "PYTHON=python3 ./compile.sh" if you use system python3 without virtual environments

e. Install the GCNet version of mmdetection (other dependencies will be installed automatically).

python(3) setup.py install  # add --user if you want to install it locally
# or "pip install ."

Note: You need to run the last step each time you pull updates from GitHub. Alternatively, run python(3) setup.py develop or pip install -e . to install mmdetection in development mode if you plan to modify it frequently.

Please refer to the mmdetection install instructions for more details.

Environment

Hardware

8 NVIDIA Tesla V100 GPUs
Intel Xeon 4114 CPU @ 2.20GHz

Software environment

Python 3.6.7
PyTorch 1.1.0
CUDA 9.0
CUDNN 7.0
NCCL 2.3.5

Usage

Train

As in the original mmdetection, distributed training is recommended on both a single machine and multiple machines.

./tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> [optional arguments]

Supported arguments are:

--validate: perform evaluation every k (default=1) epochs during the training.
--work_dir <WORK_DIR>: if specified, replaces the work directory path set in the config file.

Evaluation

To evaluate trained models, an output file is required.

python tools/test.py <CONFIG_FILE> <MODEL_PATH> [optional arguments]

Supported arguments are:

--gpus: number of GPUs used for evaluation
--out: output file name, usually ending with .pkl
--eval: type of evaluation needed; for Mask R-CNN, bbox segm evaluates both bounding box and mask AP.
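
The file written by --out can be inspected afterwards with Python's pickle module. The sketch below assumes the file holds a list-like object with one entry per test image; the exact structure of each entry depends on the model and is not documented here. The helper names and the example path are hypothetical.

```python
import pickle


def load_results(path):
    """Load the object stored in the --out file (only open files you trust)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def summarize(results):
    """Report how many entries were saved and the Python type of the first one."""
    first = results[0] if len(results) else None
    return {"num_entries": len(results), "entry_type": type(first).__name__}


# Example usage, with a placeholder file name:
# print(summarize(load_results("results.pkl")))
```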

