GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond

Overview

GCNet for Object Detection


By Yue Cao, Jiarui Xu, Stephen Lin, Fangyun Wei, Han Hu.

This repo is an official implementation of "GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond" on COCO object detection, based on open-mmlab's mmdetection. The core operator, the GC block, can be found here. Many thanks to mmdetection for their simple and clean framework.

Update on 2020/12/07

The extension of GCNet got accepted by TPAMI (PDF).

Update on 2019/10/28

GCNet won the Best Paper Award at ICCV 2019 Neural Architects Workshop!

Update on 2019/07/01

The code has been refactored. More results are provided, and all configs can be found in configs/gcnet.

Notes: Both the official PyTorch SyncBN and Apex SyncBN have some stability issues. During training, mAP may drop to zero and then return to normal during the last few epochs.

Update on 2019/06/03

GCNet is supported by the official mmdetection repo here. Thanks again for open-mmlab's work on open source projects.

Introduction

GCNet was initially described on arXiv. By absorbing the advantages of Non-Local Networks (NLNet) and Squeeze-Excitation Networks (SENet), GCNet provides a simple, fast and effective approach to global context modeling, which generally outperforms both NLNet and SENet on major benchmarks for various recognition tasks.
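The global context modeling mentioned above can be illustrated with a small dependency-free sketch. It follows the GC block formulation from the paper (attention pooling over all positions, a bottleneck transform, and broadcast addition) and is not the repository's operator, which is built from PyTorch layers. The names gc_block, w_k, w_v1 and w_v2 are illustrative assumptions, and the layer normalization here omits learnable scale and shift for brevity.

```python
import math


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    # Plain matrix-vector product; stands in for a 1x1 convolution.
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def layer_norm(vec, eps=1e-5):
    # Simplified LayerNorm without learnable affine parameters (assumption).
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]


def gc_block(features, w_k, w_v1, w_v2):
    """features: list of N positions, each a list of C channel values.

    w_k: C weights, w_v1: (C/r) x C matrix, w_v2: C x (C/r) matrix.
    """
    # 1) Context modeling: one attention weight per position,
    #    shared by all query positions (query-independent attention).
    attn = softmax([sum(w * x for w, x in zip(w_k, pos)) for pos in features])
    channels = len(features[0])
    context = [sum(a * pos[c] for a, pos in zip(attn, features))
               for c in range(channels)]
    # 2) Transform: bottleneck C -> C/r -> C with LayerNorm and ReLU in between.
    hidden = [max(0.0, h) for h in layer_norm(matvec(w_v1, context))]
    delta = matvec(w_v2, hidden)
    # 3) Fusion: add the same transformed context vector to every position.
    return [[x + d for x, d in zip(pos, delta)] for pos in features]
```

Because the attention weights do not depend on the query position, the context vector is computed once and added to every position, which is where the block gets its low cost compared with a full non-local block.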

Citing GCNet

@article{cao2019GCNet,
  title={GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond},
  author={Cao, Yue and Xu, Jiarui and Lin, Stephen and Wei, Fangyun and Hu, Han},
  journal={arXiv preprint arXiv:1904.11492},
  year={2019}
}

Main Results

Results on R50-FPN with backbone (fixBN)

| Backbone | Model | Backbone Norm | Heads | Context | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
|---|---|---|---|---|---|---|---|---|---|---|---|
| R50-FPN | Mask | fixBN | 2fc (w/o BN) | - | 1x | 3.9 | 0.453 | 10.6 | 37.3 | 34.2 | model |
| R50-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | 4.5 | 0.533 | 10.1 | 38.5 | 35.1 | model |
| R50-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | 4.6 | 0.533 | 9.9 | 38.9 | 35.5 | model |
| R50-FPN | Mask | fixBN | 2fc (w/o BN) | - | 2x | - | - | - | 38.2 | 34.9 | model |
| R50-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r16) | 2x | - | - | - | 39.7 | 36.1 | model |
| R50-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r4) | 2x | - | - | - | 40.0 | 36.2 | model |

Results on R50-FPN with backbone (syncBN)

| Backbone | Model | Backbone Norm | Heads | Context | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
|---|---|---|---|---|---|---|---|---|---|---|---|
| R50-FPN | Mask | SyncBN | 2fc (w/o BN) | - | 1x | 3.9 | 0.543 | 10.2 | 37.2 | 33.8 | model |
| R50-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | 4.5 | 0.547 | 9.9 | 39.4 | 35.7 | model |
| R50-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | 4.6 | 0.603 | 9.4 | 39.9 | 36.2 | model |
| R50-FPN | Mask | SyncBN | 2fc (w/o BN) | - | 2x | 3.9 | 0.543 | 10.2 | 37.7 | 34.3 | model |
| R50-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 2x | 4.5 | 0.547 | 9.9 | 39.7 | 36.0 | model |
| R50-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 2x | 4.6 | 0.603 | 9.4 | 40.2 | 36.3 | model |
| R50-FPN | Mask | SyncBN | 4conv1fc (SyncBN) | - | 1x | - | - | - | 38.8 | 34.6 | model |
| R50-FPN | Mask | SyncBN | 4conv1fc (SyncBN) | GC(c3-c5, r16) | 1x | - | - | - | 41.0 | 36.5 | model |
| R50-FPN | Mask | SyncBN | 4conv1fc (SyncBN) | GC(c3-c5, r4) | 1x | - | - | - | 41.4 | 37.0 | model |

Results on stronger backbones

| Backbone | Model | Backbone Norm | Heads | Context | Lr schd | Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | mask AP | Download |
|---|---|---|---|---|---|---|---|---|---|---|---|
| R101-FPN | Mask | fixBN | 2fc (w/o BN) | - | 1x | 5.8 | 0.571 | 9.5 | 39.4 | 35.9 | model |
| R101-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | 7.0 | 0.731 | 8.6 | 40.8 | 37.0 | model |
| R101-FPN | Mask | fixBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | 7.1 | 0.747 | 8.6 | 40.8 | 36.9 | model |
| R101-FPN | Mask | SyncBN | 2fc (w/o BN) | - | 1x | 5.8 | 0.665 | 9.2 | 39.8 | 36.0 | model |
| R101-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | 7.0 | 0.778 | 9.0 | 41.1 | 37.4 | model |
| R101-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | 7.1 | 0.786 | 8.9 | 41.7 | 37.6 | model |
| X101-FPN | Mask | SyncBN | 2fc (w/o BN) | - | 1x | 7.1 | 0.912 | 8.5 | 41.2 | 37.3 | model |
| X101-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | 8.2 | 1.055 | 7.7 | 42.4 | 38.0 | model |
| X101-FPN | Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | 8.3 | 1.037 | 7.6 | 42.9 | 38.5 | model |
| X101-FPN | Cascade Mask | SyncBN | 2fc (w/o BN) | - | 1x | - | - | - | 44.7 | 38.3 | model |
| X101-FPN | Cascade Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | - | - | - | 45.9 | 39.3 | model |
| X101-FPN | Cascade Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | - | - | - | 46.5 | 39.7 | model |
| X101-FPN | DCN Cascade Mask | SyncBN | 2fc (w/o BN) | - | 1x | - | - | - | 47.1 | 40.4 | model |
| X101-FPN | DCN Cascade Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r16) | 1x | - | - | - | 47.9 | 40.9 | model |
| X101-FPN | DCN Cascade Mask | SyncBN | 2fc (w/o BN) | GC(c3-c5, r4) | 1x | - | - | - | 47.9 | 40.8 | model |

Notes

  • GC denotes that a Global Context (GC) block is inserted after the 1x1 conv of the backbone.
  • DCN denotes replacing the 3x3 conv with a 3x3 Deformable Convolution in the c3-c5 stages of the backbone.
  • r4 and r16 denote ratio 4 and ratio 16 in the GC block, respectively.
  • Some of the models are trained on 4 GPUs with 4 images on each GPU.
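To make the r4 and r16 settings concrete, the sketch below computes the hidden width of the GC block's transform at each stage. It assumes that the ratio is the channel reduction factor of the block's bottleneck (C to C/r and back to C), as in the paper, and that the c3-c5 stages use the standard bottleneck ResNet output widths of 512, 1024 and 2048 channels. The function names are illustrative and not part of the repository.

```python
# Output channels of stages c3-c5 in standard bottleneck ResNet-style
# backbones (assumed here; check the backbone config you actually use).
STAGE_CHANNELS = {"c3": 512, "c4": 1024, "c5": 2048}


def gc_bottleneck_channels(ratio):
    """Hidden width of the GC transform when channels are reduced by `ratio`."""
    return {stage: ch // ratio for stage, ch in STAGE_CHANNELS.items()}


def gc_transform_weights(channels, ratio):
    """Weights in the two 1x1 convs (C -> C/r -> C), ignoring biases and norm."""
    hidden = channels // ratio
    return 2 * channels * hidden


if __name__ == "__main__":
    for ratio in (16, 4):
        print(f"r{ratio}:", gc_bottleneck_channels(ratio))
```

A smaller ratio gives a wider bottleneck and therefore more parameters per block, which is consistent with the slightly higher memory reported for r4 than for r16 in the tables above.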

Requirements

  • Linux (tested on Ubuntu 16.04)
  • Python 3.6+
  • PyTorch 1.1.0
  • Cython
  • apex (Sync BN)

Install

a. Install PyTorch 1.1 and torchvision following the official instructions.

b. Install the latest apex with CUDA and C++ extensions following these instructions. The Sync BN implemented by apex is required.

c. Clone the GCNet repository.

 git clone https://github.com/xvjiarui/GCNet.git 

d. Compile the CUDA extensions.

cd GCNet
pip install cython  # or "conda install cython" if you prefer conda
./compile.sh  # or "PYTHON=python3 ./compile.sh" if you use system python3 without virtual environments

e. Install the GCNet version of mmdetection (other dependencies will be installed automatically).

python(3) setup.py install  # add --user if you want to install it locally
# or "pip install ."

Note: You need to rerun the last step each time you pull updates from GitHub. Alternatively, run python(3) setup.py develop or pip install -e . to install mmdetection in editable mode if you modify it frequently.

Please refer to the mmdetection installation instructions for more details.

Environment

Hardware

  • 8 NVIDIA Tesla V100 GPUs
  • Intel Xeon 4114 CPU @ 2.20GHz

Software environment

  • Python 3.6.7
  • PyTorch 1.1.0
  • CUDA 9.0
  • CUDNN 7.0
  • NCCL 2.3.5

Usage

Train

As in the original mmdetection, distributed training is recommended on both a single machine and multiple machines.

./tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> [optional arguments]

Supported arguments are:

  • --validate: perform evaluation every k (default=1) epochs during training.
  • --work_dir <WORK_DIR>: if specified, overrides the work directory path in the config file.
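When launching several runs from a script, the command line above can be assembled programmatically. The following is a minimal sketch that uses only the script path and the two flags listed above. The helper name build_train_command, the config file name and the work directory are hypothetical placeholders.

```python
import shlex


def build_train_command(config_file, gpu_num, validate=False, work_dir=None):
    """Assemble the argument list for ./tools/dist_train.sh."""
    cmd = ["./tools/dist_train.sh", config_file, str(gpu_num)]
    if validate:
        cmd.append("--validate")
    if work_dir is not None:
        cmd += ["--work_dir", work_dir]
    return cmd


if __name__ == "__main__":
    # Hypothetical config name and work directory, for illustration only.
    cmd = build_train_command(
        "configs/gcnet/my_config.py", 8,
        validate=True, work_dir="work_dirs/gcnet_example",
    )
    # Print a shell-safe version of the command for inspection.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root to start training.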

Evaluation

To evaluate trained models, an output file is required.

python tools/test.py <CONFIG_FILE> <MODEL_PATH> [optional arguments]

Supported arguments are:

  • --gpus: number of GPUs used for evaluation
  • --out: output file name, usually ending with .pkl
  • --eval: type of evaluation needed; for Mask R-CNN, bbox segm evaluates both bounding box and mask AP.
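The evaluation command can be assembled in the same way. This sketch uses only the script and flags listed above. The helper name build_test_command and the file names in the example are hypothetical, and the .pkl check is just a warning because the extension is described as usual rather than required.

```python
import warnings


def build_test_command(config_file, model_path, out_file,
                       gpus=1, eval_types=("bbox", "segm")):
    """Assemble the argument list for tools/test.py."""
    if not out_file.endswith(".pkl"):
        # The output file usually ends with .pkl, so only warn here.
        warnings.warn("output file does not end with .pkl")
    cmd = ["python", "tools/test.py", config_file, model_path,
           "--gpus", str(gpus), "--out", out_file]
    if eval_types:
        # For Mask R-CNN, "bbox segm" evaluates both box and mask AP.
        cmd += ["--eval", *eval_types]
    return cmd


if __name__ == "__main__":
    # Hypothetical config, checkpoint and output names, for illustration only.
    print(build_test_command("configs/gcnet/my_config.py",
                             "work_dirs/gcnet_example/latest.pth",
                             "results.pkl", gpus=8))
```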