End-to-End Object Detection with Fully Convolutional Network

Overview

End-to-End Object Detection with Fully Convolutional Network

GitHub

This project provides an implementation for "End-to-End Object Detection with Fully Convolutional Network" on PyTorch.

Experiments in the paper were conducted on the internal framework, thus we reimplement them on cvpods and report details as below.

Requirements

Get Started

  • install cvpods locally (requires cuda to compile)
python3 -m pip install 'git+https://github.com/Megvii-BaseDetection/cvpods.git'
# (add --user if you don't have permission)

# Or, to install it from a local clone:
git clone https://github.com/Megvii-BaseDetection/cvpods.git
python3 -m pip install -e cvpods

# Or,
pip install -r requirements.txt
python3 setup.py build develop
  • prepare datasets
cd /path/to/cvpods
cd datasets
ln -s /path/to/your/coco/dataset coco
  • Train & Test
git clone https://github.com/Megvii-BaseDetection/DeFCN.git
cd DeFCN/playground/detection/coco/poto.res50.fpn.coco.800size.3x_ms  # for example

# Train
pods_train --num-gpus 8

# Test
pods_test --num-gpus 8 \
    MODEL.WEIGHTS /path/to/your/save_dir/ckpt.pth # optional
    OUTPUT_DIR /path/to/your/save_dir # optional

# Multi node training
## sudo apt install net-tools ifconfig
pods_train --num-gpus 8 --num-machines N --machine-rank 0/1/.../N-1 --dist-url "tcp://MASTER_IP:port"

Results on COCO2017 val set

model assignment with NMS lr sched. mAP mAR download
FCOS one-to-many Yes 3x + ms 41.4 59.1 weight | log
FCOS baseline one-to-many Yes 3x + ms 40.9 58.4 weight | log
Anchor one-to-one No 3x + ms 37.1 60.5 weight | log
Center one-to-one No 3x + ms 35.2 61.0 weight | log
Foreground Loss one-to-one No 3x + ms 38.7 62.2 weight | log
POTO one-to-one No 3x + ms 39.2 61.7 weight | log
POTO + 3DMF one-to-one No 3x + ms 40.6 61.6 weight | log
POTO + 3DMF + Aux mixture* No 3x + ms 41.4 61.5 weight | log

* We adopt a one-to-one assignment in POTO and a one-to-many assignment in the auxiliary loss, respectively.

  • 2x + ms schedule is adopted in the paper, but we adopt 3x + ms schedule here to achieve higher performance.
  • It's normal to observe ~0.3AP noise in POTO.

Results on CrowdHuman val set

model assignment with NMS lr sched. AP50 mMR recall download
FCOS one-to-many Yes 30k iters 86.1 54.9 94.2 weight | log
ATSS one-to-many Yes 30k iters 87.2 49.7 94.0 weight | log
POTO one-to-one No 30k iters 88.5 52.2 96.3 weight | log
POTO + 3DMF one-to-one No 30k iters 88.8 51.0 96.6 weight | log
POTO + 3DMF + Aux mixture* No 30k iters 89.1 48.9 96.5 weight | log

* We adopt a one-to-one assignment in POTO and a one-to-many assignment in the auxiliary loss, respectively.

  • It's normal to observe ~0.3AP noise in POTO, and ~1.0mMR noise in all methods.

Ablations on COCO2017 val set

model assignment with NMS lr sched. mAP mAR note
POTO one-to-one No 6x + ms 40.0 61.9
POTO one-to-one No 9x + ms 40.2 62.3
POTO one-to-one No 3x + ms 39.2 61.1 replace Hungarian algorithm by argmax
POTO + 3DMF one-to-one No 3x + ms 40.9 62.0 remove GN in 3DMF
POTO + 3DMF + Aux mixture* No 3x + ms 41.5 61.5 remove GN in 3DMF

* We adopt a one-to-one assignment in POTO and a one-to-many assignment in the auxiliary loss, respectively.

  • For one-to-one assignment, more training iters lead to higher performance.
  • The argmax (also known as top-1) operation is indeed the approximate solution of bipartite matching in dense prediction methods.
  • It seems harmless to remove GN in 3DMF, which also leads to higher inference speed.

Acknowledgement

This repo is developed based on cvpods. Please check cvpods for more details and features.

License

This repo is released under the Apache 2.0 license. Please see the LICENSE file for more information.

Citing

If you use this work in your research or wish to refer to the baseline results published here, please use the following BibTeX entries:

@article{wang2020end,
  title   =  {End-to-End Object Detection with Fully Convolutional Network},
  author  =  {Wang, Jianfeng and Song, Lin and Li, Zeming and Sun, Hongbin and Sun, Jian and Zheng, Nanning},
  journal =  {arXiv preprint arXiv:2012.03544},
  year    =  {2020}
}

Contributing to the project

Any pull requests or issues about the implementation are welcome. If you have any issue about the library (e.g. installation, environments), please refer to cvpods.

Owner
BaseDetection Team of Megvii
Implementation of accepted AAAI 2021 paper: Deep Unsupervised Image Hashing by Maximizing Bit Entropy

Deep Unsupervised Image Hashing by Maximizing Bit Entropy This is the PyTorch implementation of accepted AAAI 2021 paper: Deep Unsupervised Image Hash

62 Dec 30, 2022
Official repository for: Continuous Control With Ensemble DeepDeterministic Policy Gradients

Continuous Control With Ensemble Deep Deterministic Policy Gradients This repository is the official implementation of Continuous Control With Ensembl

4 Dec 06, 2021
Semi-Autoregressive Transformer for Image Captioning

Semi-Autoregressive Transformer for Image Captioning Requirements Python 3.6 Pytorch 1.6 Prepare data Please use git clone --recurse-submodules to clo

YE Zhou 23 Dec 09, 2022
571 Dec 25, 2022
Classifies galaxy morphology with Bayesian CNN

Zoobot Zoobot classifies galaxy morphology with deep learning. This code will let you: Reproduce and improve the Galaxy Zoo DECaLS automated classific

Mike Walmsley 39 Dec 20, 2022
Accelerated NLP pipelines for fast inference on CPU and GPU. Built with Transformers, Optimum and ONNX Runtime.

Optimum Transformers Accelerated NLP pipelines for fast inference 🚀 on CPU and GPU. Built with 🤗 Transformers, Optimum and ONNX runtime. Installatio

Aleksey Korshuk 115 Dec 16, 2022
OpenMMLab 3D Human Parametric Model Toolbox and Benchmark

Introduction English | 简体中文 MMHuman3D is an open source PyTorch-based codebase for the use of 3D human parametric models in computer vision and comput

OpenMMLab 782 Jan 04, 2023
Code and data for ACL2021 paper Cross-Lingual Abstractive Summarization with Limited Parallel Resources.

Multi-Task Framework for Cross-Lingual Abstractive Summarization (MCLAS) The code for ACL2021 paper Cross-Lingual Abstractive Summarization with Limit

Yu Bai 43 Nov 07, 2022
Backend code to use MCPI's python API to make infinite worlds with custom generation

inf-mcpi Backend code to use MCPI's python API to make infinite worlds with custom generation Does not save player-placed blocks! Generation is still

5 Oct 04, 2022
Implementation of "Semi-supervised Domain Adaptive Structure Learning"

Semi-supervised Domain Adaptive Structure Learning - ASDA This repo contains the source code and dataset for our ASDA paper. Illustration of the propo

3 Dec 13, 2021
Implementation of the bachelor's thesis "Real-time stock predictions with deep learning and news scraping".

Real-time stock predictions with deep learning and news scraping This repository contains a partial implementation of my bachelor's thesis "Real-time

David Álvarez de la Torre 0 Feb 09, 2022
Generating Anime Images by Implementing Deep Convolutional Generative Adversarial Networks paper

AnimeGAN - Deep Convolutional Generative Adverserial Network PyTorch implementation of DCGAN introduced in the paper: Unsupervised Representation Lear

Rohit Kukreja 23 Jul 21, 2022
Autoencoders pretraining using clustering

Autoencoders pretraining using clustering

IITiS PAN 2 Dec 16, 2021
Multi-Task Learning as a Bargaining Game

Nash-MTL Official implementation of "Multi-Task Learning as a Bargaining Game". Setup environment conda create -n nashmtl python=3.9.7 conda activate

Aviv Navon 87 Dec 26, 2022
Python script that analyses the given datasets and comes up with the best polynomial regression representation with the smallest polynomial degree possible

Python script that analyses the given datasets and comes up with the best polynomial regression representation with the smallest polynomial degree possible, to be the most reliable with the least com

Nikolas B Virionis 2 Aug 01, 2022
Official re-implementation of the Calibrated Adversarial Refinement model described in the paper Calibrated Adversarial Refinement for Stochastic Semantic Segmentation

Official re-implementation of the Calibrated Adversarial Refinement model described in the paper Calibrated Adversarial Refinement for Stochastic Semantic Segmentation

Elias Kassapis 31 Nov 22, 2022
Moiré Attack (MA): A New Potential Risk of Screen Photos [NeurIPS 2021]

Moiré Attack (MA): A New Potential Risk of Screen Photos [NeurIPS 2021] This repository is the official implementation of Moiré Attack (MA): A New Pot

Dantong Niu 22 Dec 24, 2022
App customer segmentation cohort rfm clustering

CUSTOMER SEGMENTATION COHORT RFM CLUSTERING TỔNG QUAN VỀ HỆ THỐNG DỮ LIỆU Nên chuyển qua theme màu dark thì sẽ nhìn đẹp hơn https://customer-segmentat

hieulmsc 3 Dec 18, 2021
This is official implementaion of paper "Token Shift Transformer for Video Classification".

This is official implementaion of paper "Token Shift Transformer for Video Classification". We achieve SOTA performance 80.40% on Kinetics-400 val. Paper link

VideoNet 60 Dec 30, 2022
3D Pose Estimation for Vehicles

3D Pose Estimation for Vehicles Introduction This work generates 4 key-points and 2 key-edges from vertices and edges of vehicles as ground truth. The

Jingyi Wang 1 Nov 01, 2021