QueryInst: Parallelly Supervised Mask Query for Instance Segmentation

Overview

QueryInst: Parallelly Supervised Mask Query for Instance Segmentation

  • TL;DR: QueryInst is a simple and effective query based instance segmentation method driven by parallel supervision on dynamic mask heads, which outperforms previous arts in terms of both accuracy and speed.

QueryInst: Parallelly Supervised Mask Query for Instance Segmentation,

by Yuxin Fang*, Shusheng Yang*, Xinggang Wang†, Yu Li, Chen Fang, Ying Shan, Bin Feng, Wenyu Liu.

(*) equal contribution, (†) corresponding author.

arXiv technical report (arXiv 2105.01928)

QueryInst

  • This repo serves as the official implementation for QueryInst, based on mmdetection and built upon Sparse R-CNN & DETR. Implantations based on Detectron2 will be released in the near future.

  • This project is under active development, we will extend QueryInst to a wide range of instance-level recognition tasks.

Updates

[06/05/2021] 🌟 QueryInst training and inference code has been released!

Getting Started

python setup.py develop
  • Prepare datasets:
mkdir data && cd data
ln -s /path/to/coco coco
  • Training QueryInst with single GPU:
python tools/train.py configs/queryinst/queryinst_r50_fpn_1x_coco.py
  • Training QueryInst with multi GPUs:
./tools/dist_train.sh configs/queryinst/queryinst_r50_fpn_1x_coco.py 8
  • Test QueryInst on COCO val set with single GPU:
python tools/test.py configs/queryinst/queryinst_r50_fpn_1x_coco.py PATH/TO/CKPT.pth --eval bbox segm
  • Test QueryInst on COCO val set with multi GPUs:
./tools/dist_test.sh configs/queryinst/queryinst_r50_fpn_1x_coco.py PATH/TO/CKPT.pth 8 --eval bbox segm

Main Results on COCO val

Configs Aug. Weights Box AP Mask AP
QueryInst_R50_3x_300_queries 480 ~ 800, w/ Crop - 46.9 41.4
QueryInst_R101_3x_300_queries 480 ~ 800, w/ Crop - 48.0 42.4
QueryInst_X101-DCN_3x_300_queries 480 ~ 800, w/ Crop - 50.3 44.2

Citation

If you find our paper and code useful in your research, please consider giving a star and citation ?? :

@article{QueryInst,
  title={QueryInst: Parallelly Supervised Mask Query for Instance Segmentation},
  author={Fang, Yuxin and Yang, Shusheng and Wang, Xinggang and Li, Yu and Fang, Chen and Shan, Ying and Feng, Bin and Liu, Wenyu},
  journal={arXiv preprint arXiv:2105.01928},
  year={2021}
}

TODO

  • QueryInst training and inference code.
  • QueryInst based on Detectron2 toolbox will be released in the near future.
  • QueryInst configurations for Cityscapes and YouTube-VIS.
  • QueryInst pretrain weights.
Owner
Hust Visual Learning Team
Hust Visual Learning Team belongs to the Artificial Intelligence Research Institute in the School of EIC in HUST
Hust Visual Learning Team
Information Gain Filtration (IGF) is a method for filtering domain-specific data during language model finetuning. IGF shows significant improvements over baseline fine-tuning without data filtration.

Information Gain Filtration Information Gain Filtration (IGF) is a method for filtering domain-specific data during language model finetuning. IGF sho

4 Jul 28, 2022
Ray tracing of a Schwarzschild black hole written entirely in TensorFlow.

TensorGeodesic Ray tracing of a Schwarzschild black hole written entirely in TensorFlow. Dependencies: Python 3 TensorFlow 2.x numpy matplotlib About

5 Jan 15, 2022
Real-time analysis of intracranial neurophysiology recordings.

py_neuromodulation Click this button to run the "Tutorial ML with py_neuro" notebooks: The py_neuromodulation toolbox allows for real time capable pro

Interventional Cognitive Neuromodulation - Neumann Lab Berlin 15 Nov 03, 2022
Embracing Single Stride 3D Object Detector with Sparse Transformer

SST: Single-stride Sparse Transformer This is the official implementation of paper: Embracing Single Stride 3D Object Detector with Sparse Transformer

TuSimple 385 Dec 28, 2022
Implementation of a Transformer that Ponders, using the scheme from the PonderNet paper

Ponder(ing) Transformer Implementation of a Transformer that learns to adapt the number of computational steps it takes depending on the difficulty of

Phil Wang 65 Oct 04, 2022
Official repository of ICCV21 paper "Viewpoint Invariant Dense Matching for Visual Geolocalization"

Viewpoint Invariant Dense Matching for Visual Geolocalization: PyTorch implementation This is the implementation of the ICCV21 paper: G Berton, C. Mas

Gabriele Berton 44 Jan 03, 2023
improvement of CLIP features over the traditional resnet features on the visual question answering, image captioning, navigation and visual entailment tasks.

CLIP-ViL In our paper "How Much Can CLIP Benefit Vision-and-Language Tasks?", we show the improvement of CLIP features over the traditional resnet fea

310 Dec 28, 2022
This repository contains various models targetting multimodal representation learning, multimodal fusion for downstream tasks such as multimodal sentiment analysis.

Multimodal Deep Learning 🎆 🎆 🎆 Announcing the multimodal deep learning repository that contains implementation of various deep learning-based model

Deep Cognition and Language Research (DeCLaRe) Lab 398 Dec 30, 2022
TargetAllDomainObjects - A python wrapper to run a command on against all users/computers/DCs of a Windows Domain

TargetAllDomainObjects A python wrapper to run a command on against all users/co

Podalirius 19 Dec 13, 2022
Chinese Advertisement Board Identification(Pytorch)

Chinese-Advertisement-Board-Identification. We use YoloV5 to extract the ROI of the location of the chinese word. Next, we sort the bounding box and recognize every chinese words which we extracted.

Li-Wei Hsiao 12 Jul 21, 2022
Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding (CVPR2022)

Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding by Qiaole Dong*, Chenjie Cao*, Yanwei Fu Paper and Supple

Qiaole Dong 190 Dec 27, 2022
The devkit of the nuScenes dataset.

nuScenes devkit Welcome to the devkit of the nuScenes and nuImages datasets. Overview Changelog Devkit setup nuImages nuImages setup Getting started w

Motional 1.6k Jan 05, 2023
A Parameter-free Deep Embedded Clustering Method for Single-cell RNA-seq Data

A Parameter-free Deep Embedded Clustering Method for Single-cell RNA-seq Data Overview Clustering analysis is widely utilized in single-cell RNA-seque

AI-Biomed @NSCC-gz 3 May 08, 2022
No Code AI/ML platform

NoCodeAIML No Code AI/ML platform - Community Edition Video credits: Uday Kiran Typical No Code AI/ML Platform will have features like drag and drop,

Bhagvan Kommadi 5 Jan 28, 2022
FMA: A Dataset For Music Analysis

FMA: A Dataset For Music Analysis Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson. International Society for Music Information

Michaël Defferrard 1.8k Dec 29, 2022
Honours project, on creating a depth estimation map from two stereo images of featureless regions

image-processing This module generates depth maps for shape-blocked-out images Install If working with anaconda, then from the root directory: conda e

2 Oct 17, 2022
🤗 Paper Style Guide

🤗 Paper Style Guide (Work in progress, send a PR!) Libraries to Know booktabs natbib cleveref Either seaborn, plotly or altair for graphs algorithmic

Hugging Face 66 Dec 12, 2022
​ This is the Pytorch implementation of Progressive Attentional Manifold Alignment.

PAMA This is the Pytorch implementation of Progressive Attentional Manifold Alignment. Requirements python 3.6 pytorch 1.2.0+ PIL, numpy, matplotlib C

98 Nov 15, 2022
Train CPPNs as a Generative Model, using Generative Adversarial Networks and Variational Autoencoder techniques to produce high resolution images.

cppn-gan-vae tensorflow Train Compositional Pattern Producing Network as a Generative Model, using Generative Adversarial Networks and Variational Aut

hardmaru 343 Dec 29, 2022
[ICCV 2021] Official Pytorch implementation for Discriminative Region-based Multi-Label Zero-Shot Learning SOTA results on NUS-WIDE and OpenImages

Discriminative Region-based Multi-Label Zero-Shot Learning (ICCV 2021) [arXiv][Project page coming soon] Sanath Narayan*, Akshita Gupta*, Salman Kh

Akshita Gupta 54 Nov 21, 2022