DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation

Related tags

Deep LearningDCT-Mask
Overview

DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation

This project hosts the code for implementing the DCT-MASK algorithms for instance segmentation.

[DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation] Xing Shen*, Jirui Yang*, Chunbo Wei, Bing Deng, Jianqiang Huang, Xiansheng Hua Xiaoliang Cheng, Kewei Liang

In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition(CVPR 2021)

arXiv preprint(arXiv:2011.09876)

Contributions

  • We propose a high-quality and low-complexity mask representation for instance segmentation, which encodes the high-resolution binary mask into a compact vector with discrete cosine transform.
  • With slight modifications, DCT-Mask could be integrated into most pixel-based frameworks, and achieve significant and consistent improvement on different datasets, backbones, and training schedules. Specifically, it obtains more improvements for more complex backbones and higher-quality annotations.
  • DCT-Mask does not require extra pre-processing or pre-training. It achieves high-resolution mask prediction at a speed similar to low-resolution.

Installation

Requirements

  • PyTorch ≥ 1.5 and fvcore == 0.1.1.post20200716

This implementation is based on detectron2. Please refer to INSTALL.md. for installation and dataset preparation.

Usage

The codes of this project is on projects/DCT_Mask/

Train with multiple GPUs

cd ./projects/DCT_Mask/
./train1.sh

Testing

cd ./projects/DCT_Mask/
./test1.sh

Model ZOO

Trained models on COCO

Model Backbone Schedule Multi-scale training Inference time (s/im) AP (minival) Link
DCT-Mask R-CNN R50 1x Yes 0.0465 36.5 download(Fetch code: xpdm)
DCT-Mask R-CNN R101 3x Yes 0.0595 39.9 download(Fetch code: 7q6x)
DCT-Mask R-CNN RX101 3x Yes 0.1049 41.2 download(Fetch code: ufw2)
Casecade DCT-Mask R-CNN R50 1x Yes 0.0630 37.5 download(Fetch code: yqxp)
Casecade DCT-Mask R-CNN R101 3x Yes 0.0750 40.8 download(Fetch code: r8xv)
Casecade DCT-Mask R-CNN RX101 3x Yes 0.1195 42.0 download(Fetch code: pdej)

Trained models on Cityscapes

Model Data Backbone Schedule Multi-scale training AP (val) Link
DCT-Mask R-CNN Fine-Only R50 1x Yes 37.0 download(Fetch code: dn7i)
DCT-Mask R-CNN CoCo-Pretrain +Fine R50 1x Yes 39.6 download(Fetch code: ntqf)

Notes

  • We observe about 0.2 AP noise in COCO.
  • High variance observed in CityScapes when trained on fine annotations only. We report the median of 5 runs AP in the article (i.e. 35.6), while in this repo we report the best results (37.0).
  • Initialized from COCO pre-training will reduce the variance on CityScapes as well as increasing mask AP.
  • The inference time is measured on single GPU with batchsize 1. All GPUs are NVIDIA V100.
  • Lvis 0.5 is used for evaluation.

Contributing to the project

Any pull requests or issues are welcome.

If there is any problem with this project, please contact Xing Shen.

Citations

Please consider citing our papers in your publications if the project helps your research.

License

  • MIT License.
Owner
Alibaba Cloud
More Than Just Cloud
Alibaba Cloud
Fine-tuning StyleGAN2 for Cartoon Face Generation

Cartoon-StyleGAN 🙃 : Fine-tuning StyleGAN2 for Cartoon Face Generation Abstract Recent studies have shown remarkable success in the unsupervised imag

Jihye Back 520 Jan 04, 2023
FLAVR is a fast, flow-free frame interpolation method capable of single shot multi-frame prediction

FLAVR is a fast, flow-free frame interpolation method capable of single shot multi-frame prediction. It uses a customized encoder decoder architecture with spatio-temporal convolutions and channel ga

Tarun K 280 Dec 23, 2022
[CVPR 21] Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting, IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2021.

Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting, CVPR 2021. Ayan Kumar Bhunia, Pinaki nath Chowdhury, Yongxin Yan

Ayan Kumar Bhunia 44 Dec 12, 2022
Imposter-detector-2022 - HackED 2022 Team 3IQ - 2022 Imposter Detector

HackED 2022 Team 3IQ - 2022 Imposter Detector By Aneeljyot Alagh, Curtis Kan, Jo

Joshua Ji 3 Aug 20, 2022
This is the official pytorch implementation for our ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" on VQA Task

🌈 ERASOR (RA-L'21 with ICRA Option) Official page of "ERASOR: Egocentric Ratio of Pseudo Occupancy-based Dynamic Object Removal for Static 3D Point C

Hyungtae Lim 225 Dec 29, 2022
Pytorch implementation of “Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement”

Graph-to-Graph Transformers Self-attention models, such as Transformer, have been hugely successful in a wide range of natural language processing (NL

Idiap Research Institute 40 Aug 14, 2022
Hands-On Machine Learning for Algorithmic Trading, published by Packt

Hands-On Machine Learning for Algorithmic Trading Hands-On Machine Learning for Algorithmic Trading, published by Packt This is the code repository fo

Packt 981 Dec 29, 2022
Rot-Pro: Modeling Transitivity by Projection in Knowledge Graph Embedding

Rot-Pro : Modeling Transitivity by Projection in Knowledge Graph Embedding This repository contains the source code for the Rot-Pro model, presented a

Tewi 9 Sep 28, 2022
Large scale PTM - PPI relation extraction

Large-scale protein-protein post-translational modification extraction with distant supervision and confidence calibrated BioBERT The silver standard

1 Feb 25, 2022
classify fashion-mnist dataset with pytorch

Fashion-Mnist Classifier with PyTorch Inference 1- clone this repository: git clone https://github.com/Jhamed7/Fashion-Mnist-Classifier.git 2- Instal

1 Jan 14, 2022
Sign Language Translation with Transformers (COLING'2020, ECCV'20 SLRTP Workshop)

transformer-slt This repository gathers data and code supporting the experiments in the paper Better Sign Language Translation with STMC-Transformer.

Kayo Yin 107 Dec 27, 2022
Region-aware Contrastive Learning for Semantic Segmentation, ICCV 2021

Region-aware Contrastive Learning for Semantic Segmentation, ICCV 2021 Abstract Recent works have made great success in semantic segmentation by explo

Hanzhe Hu 30 Dec 29, 2022
code associated with ACL 2021 DExperts paper

DExperts Hi! This repository contains code for the paper DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts to appear at

Alisa Liu 68 Dec 15, 2022
VSR-Transformer - This paper proposes a new Transformer for video super-resolution (called VSR-Transformer).

VSR-Transformer By Jiezhang Cao, Yawei Li, Kai Zhang, Luc Van Gool This paper proposes a new Transformer for video super-resolution (called VSR-Transf

Jiezhang Cao 225 Nov 13, 2022
Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks This repository contains a TensorFlow implementation of "

Jingwei Zheng 5 Jan 08, 2023
[IROS'21] SurRoL: An Open-source Reinforcement Learning Centered and dVRK Compatible Platform for Surgical Robot Learning

SurRoL IROS 2021 SurRoL: An Open-source Reinforcement Learning Centered and dVRK Compatible Platform for Surgical Robot Learning Features dVRK compati

<a href=[email protected]"> 55 Jan 03, 2023
Code repo for "Transformer on a Diet" paper

Transformer on a Diet Reference: C Wang, Z Ye, A Zhang, Z Zhang, A Smola. "Transformer on a Diet". arXiv preprint arXiv (2020). Installation pip insta

cgraywang 31 Sep 26, 2021
✨✨✨An awesome open source toolbox for stereo matching.

OpenStereo This is an awesome open source toolbox for stereo matching. Supported Methods: BM SGM(T-PAMI'07) GCNet(ICCV'17) PSMNet(CVPR'18) StereoNet(E

Wang Qingyu 6 Nov 04, 2022
Evaluating AlexNet features at various depths

Linear Separability Evaluation This repo provides the scripts to test a learned AlexNet's feature representation performance at the five different con

Yuki M. Asano 32 Dec 30, 2022
Codes for "Solving Long-tailed Recognition with Deep Realistic Taxonomic Classifier"

Deep-RTC [project page] This repository contains the source code accompanying our ECCV 2020 paper. Solving Long-tailed Recognition with Deep Realistic

Gina Wu 16 May 26, 2022