Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"

Last update: Jan 02, 2023

Overview

Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation

Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar

[arXiv] [Project] [BibTeX]

Features

A single architecture for panoptic, instance and semantic segmentation.
Supports the major segmentation datasets: ADE20K, Cityscapes, COCO, and Mapillary Vistas.
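
To illustrate how one set of predictions can serve more than one segmentation task, here is a minimal NumPy sketch of the mask-classification idea: each query predicts a class distribution and a mask, and the same outputs are post-processed differently per task. This is a toy illustration, not the repository's code. The function names, the `score_threshold` parameter, and the simplified instance selection are assumptions made for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def semantic_from_queries(class_logits, mask_logits):
    """Combine per-query predictions into one semantic label map.

    class_logits: (Q, C + 1), last column assumed to be "no object".
    mask_logits:  (Q, H, W).
    Returns an (H, W) array of class indices.
    """
    class_probs = softmax(class_logits, axis=-1)[:, :-1]  # drop "no object"
    mask_probs = sigmoid(mask_logits)
    # Sum over queries: per-class score for every pixel.
    per_class = np.einsum("qc,qhw->chw", class_probs, mask_probs)
    return per_class.argmax(axis=0)


def instances_from_queries(class_logits, mask_logits, score_threshold=0.5):
    """Simplified instance output: one (label, score, binary mask) per confident query."""
    class_probs = softmax(class_logits, axis=-1)[:, :-1]
    labels = class_probs.argmax(axis=-1)
    scores = class_probs.max(axis=-1)
    keep = scores > score_threshold
    return [
        (int(label), float(score), mask > 0)
        for label, score, mask in zip(labels[keep], scores[keep], mask_logits[keep])
    ]


# Two queries, two classes, a 2x2 image: query 0 covers the left column,
# query 1 covers the right column.
class_logits = np.array([[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]])
mask_logits = np.array([
    [[5.0, -5.0], [5.0, -5.0]],
    [[-5.0, 5.0], [-5.0, 5.0]],
])
semantic_map = semantic_from_queries(class_logits, mask_logits)
instances = instances_from_queries(class_logits, mask_logits)
```

In this toy input, `semantic_map` assigns class 0 to the left column and class 1 to the right column, while `instances` holds one entry per query. The model outputs are the same in both cases; only the post-processing differs.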

Installation

See installation instructions.

Getting Started

See Preparing Datasets for Mask2Former.

See Getting Started with Mask2Former.

Advanced usage

See Advanced Usage of Mask2Former.

Model Zoo and Baselines

We provide a large set of baseline results and trained models available for download in the Mask2Former Model Zoo.

License

The majority of Mask2Former is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

However, portions of the project are available under separate license terms: Swin-Transformer-Semantic-Segmentation is licensed under the MIT license, and Deformable-DETR is licensed under the Apache-2.0 License.

Citing Mask2Former

If you use Mask2Former in your research or wish to refer to the baseline results published in the Model Zoo, please use the following BibTeX entry.

@article{cheng2021mask2former,
  title={Masked-attention Mask Transformer for Universal Image Segmentation},
  author={Bowen Cheng and Ishan Misra and Alexander G. Schwing and Alexander Kirillov and Rohit Girdhar},
  journal={arXiv},
  year={2021}
}

If you find the code useful, please also consider the following BibTeX entry.

@inproceedings{cheng2021maskformer,
  title={Per-Pixel Classification is Not All You Need for Semantic Segmentation},
  author={Bowen Cheng and Alexander G. Schwing and Alexander Kirillov},
  booktitle={NeurIPS},
  year={2021}
}

Acknowledgement

Code is largely based on MaskFormer (https://github.com/facebookresearch/MaskFormer).

Owner

Meta Research
