Per-Pixel Classification is Not All You Need for Semantic Segmentation

Last update: Jan 08, 2023

Related tags

Deep Learning MaskFormer

Overview

MaskFormer: Per-Pixel Classification is Not All You Need for Semantic Segmentation

Bowen Cheng, Alexander G. Schwing, Alexander Kirillov

[arXiv] [Project] [BibTeX]

Features

Better results while being more efficient.
Unified view of semantic- and instance-level segmentation tasks.
Support major semantic segmentation datasets: ADE20K, Cityscapes, COCO-Stuff, Mapillary Vistas.
Support ALL Detectron2 models.

Installation

See installation instructions.

Getting Started

See Preparing Datasets for MaskFormer.

See Getting Started with MaskFormer.

Model Zoo and Baselines

We provide a large set of baseline results and trained models available for download in the MaskFormer Model Zoo.

License

Shield:

The majority of MaskFormer is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

However portions of the project are available under separate license terms: Swin-Transformer-Semantic-Segmentation is licensed under the MIT license.

Citing MaskFormer

If you use MaskFormer in your research or wish to refer to the baseline results published in the Model Zoo, please use the following BibTeX entry.

@article{cheng2021maskformer,
  title={Per-Pixel Classification is Not All You Need for Semantic Segmentation},
  author={Bowen Cheng and Alexander G. Schwing and Alexander Kirillov},
  journal={arXiv},
  year={2021}
}

Per-Pixel Classification is Not All You Need for Semantic Segmentation

Related tags

Overview

MaskFormer: Per-Pixel Classification is Not All You Need for Semantic Segmentation

Features

Installation

Getting Started

Model Zoo and Baselines

License

Citing MaskFormer

Owner

Facebook Research

Self-Supervised Methods for Noise-Removal

Created as part of CS50 AI's coursework. This AI makes use of knowledge entailment to calculate the best probabilities to win Minesweeper.

Official implementation of the paper Momentum Capsule Networks (MoCapsNet)

Human pose estimation from video plays a critical role in various applications such as quantifying physical exercises, sign language recognition, and full-body gesture control.

PyTorch implementation for MINE: Continuous-Depth MPI with Neural Radiance Fields

CondNet: Conditional Classifier for Scene Segmentation

Learning to Disambiguate Strongly Interacting Hands via Probabilistic Per-Pixel Part Segmentation [3DV 2021 Oral]

TensorFlow implementation of "Attention is all you need (Transformer)"

dualFace: Two-Stage Drawing Guidance for Freehand Portrait Sketching (CVMJ)

Beginner-friendly repository for Hacktober Fest 2021. Start your contribution to open source through baby steps. 💜

MapReader: A computer vision pipeline for the semantic exploration of maps at scale

Official PyTorch Implementation of Hypercorrelation Squeeze for Few-Shot Segmentation, arXiv 2021

Using Tensorflow Object Detection API to detect Waymo open dataset

ExCon: Explanation-driven Supervised Contrastive Learning

Task-related Saliency Network For Few-shot learning

FADNet++: Real-Time and Accurate Disparity Estimation with Configurable Networks

Rotation Robust Descriptors

SVG Icon processing tool for C++

Code for the paper BERT might be Overkill: A Tiny but Effective Biomedical Entity Linker based on Residual Convolutional Neural Networks

Examples of how to create colorful, annotated equations in Latex using Tikz.