Official PyTorch Implementation of Mask-aware IoU and maYOLACT Detector [BMVC2021]

Overview

The official implementation of Mask-aware IoU and maYOLACT detector. Our implementation is based on mmdetection.

Mask-aware IoU for Anchor Assignment in Real-time Instance Segmentation,
Kemal Oksuz, Baris Can Cam, Fehmi Kahraman, Zeynep Sonat Baltaci, Emre Akbas, Sinan Kalkan, BMVC 2021. (arXiv pre-print)

Summary

Mask-aware IoU: Mask-aware IoU (maIoU) is an IoU variant for better anchor assignment to supervise instance segmentation methods. Unlike the standard IoU, Mask-aware IoU also considers the ground truth masks while assigning a proximity score for an anchor. As a result, for example, if an anchor box overlaps with a ground truth box, but not with the mask of the ground truth, e.g. due to occlusion, then it has a lower score compared to IoU. Please check out the examples below for more insight. Replacing IoU by our maIoU in the state of the art ATSS assigner yields both performance improvement and efficiency (i.e. faster inference) compared to the standard YOLACT method.

maYOLACT Detector: Thanks to the efficiency due to ATSS with maIoU assigner, we incorporate more training tricks into YOLACT, and built maYOLACT Detector which is still real-time but significantly powerful (around 6 AP) than YOLACT. Our best maYOLACT model reaches SOTA performance by 37.7 mask AP on COCO test-dev at 25 fps.

How to Cite

Please cite the paper if you benefit from our paper or the repository:

@inproceedings{maIoU,
       title = {Mask-aware IoU for Anchor Assignment in Real-time Instance Segmentation},
       author = {Kemal Oksuz and Baris Can Cam and Fehmi Kahraman and Zeynep Sonat Baltaci and Sinan Kalkan and Emre Akbas},
       booktitle = {The British Machine Vision Conference (BMCV)},
       year = {2021}
}

Specification of Dependencies and Preparation

  • Please see get_started.md for requirements and installation of mmdetection.
  • Please refer to introduction.md for dataset preparation and basic usage of mmdetection.

Trained Models

Here, we report results in terms of AP (higher better) and oLRP (lower better).

Multi-stage Object Detection

Comparison of Different Assigners (on COCO minival)

Scale Assigner mask AP mask oLRP Log Config Model
400 Fixed IoU 24.8 78.3 log config model
400 ATSS w. IoU 25.3 77.7 log config model
400 ATSS w. maIoU 26.1 77.1 log config model
550 Fixed IoU 28.5 75.2 log config model
550 ATSS w. IoU 29.3 74.5 log config model
550 ATSS w. maIoU 30.4 73.7 log config model
700 Fixed IoU 29.7 74.3 log config model
700 ATSS w. IoU 30.8 73.3 log config model
700 ATSS w. maIoU 31.8 72.5 log config model

maYOLACT Detector (on COCO test-dev)

Scale Backbone mask AP fps Log Config Model
maYOLACT-550 ResNet-50 35.2 30 Coming Soon
maYOLACT-700 ResNet-50 37.7 25 Coming Soon

Running the Code

Training Code

The configuration files of all models listed above can be found in the configs/mayolact folder. You can follow get_started.md for training code. As an example, to train maYOLACT using images with 550 scale on 4 GPUs as we did, use the following command:

./tools/dist_train.sh configs/mayolact/mayolact_r50_4x8_coco_scale550.py 4

Test Code

The configuration files of all models listed above can be found in the configs/mayolact folder. You can follow get_started.md for test code. As an example, first download a trained model using the links provided in the tables below or you train a model, then run the following command to test a model model on multiple GPUs:

./tools/dist_test.sh configs/mayolact/mayolact_r50_4x8_coco_scale550.py ${CHECKPOINT_FILE} 4 --eval bbox segm 

You can also test a model on a single GPU with the following example command:

python tools/test.py configs/mayolact/mayolact_r50_4x8_coco_scale550.py ${CHECKPOINT_FILE} --eval bbox segm
Owner
Kemal Oksuz
Kemal Oksuz
Le dataset des images du projet d'IA de 2021

face-mask-dataset-ilc-2021 Le dataset des images du projet d'IA de 2021, Indiquez vos id git dans la issue pour les droits TL;DR: Choisir 200 images J

7 Nov 15, 2021
[CVPR 2020] Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation

Contents Local and Global GAN Cross-View Image Translation Semantic Image Synthesis Acknowledgments Related Projects Citation Contributions Collaborat

Hao Tang 131 Dec 07, 2022
Python port of R's Comprehensive Dynamic Time Warp algorithm package

Welcome to the dtw-python package Comprehensive implementation of Dynamic Time Warping algorithms. DTW is a family of algorithms which compute the loc

Dynamic Time Warping algorithms 154 Dec 26, 2022
Codes for our paper The Stem Cell Hypothesis: Dilemma behind Multi-Task Learning with Transformer Encoders published to EMNLP 2021.

The Stem Cell Hypothesis Codes for our paper The Stem Cell Hypothesis: Dilemma behind Multi-Task Learning with Transformer Encoders published to EMNLP

Emory NLP 5 Jul 08, 2022
Real-time LIDAR-based Urban Road and Sidewalk detection for Autonomous Vehicles 🚗

urban_road_filter: a real-time LIDAR-based urban road and sidewalk detection algorithm for autonomous vehicles Dependency ROS (tested with Kinetic and

JKK - Vehicle Industry Research Center 180 Dec 12, 2022
Code for the paper: Adversarial Training Against Location-Optimized Adversarial Patches. ECCV-W 2020.

Adversarial Training Against Location-Optimized Adversarial Patches arXiv | Paper | Code | Video | Slides Code for the paper: Sukrut Rao, David Stutz,

Sukrut Rao 32 Dec 13, 2022
Image Completion with Deep Learning in TensorFlow

Image Completion with Deep Learning in TensorFlow See my blog post for more details and usage instructions. This repository implements Raymond Yeh and

Brandon Amos 1.3k Dec 23, 2022
A unified 3D Transformer Pipeline for visual synthesis

Overview This is the official repo for the paper: "NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion". NÜWA is a unified multimodal

Microsoft 2.6k Jan 03, 2023
The fastest way to visualize GradCAM with your Keras models.

VizGradCAM VizGradCam is the fastest way to visualize GradCAM in Keras models. GradCAM helps with providing visual explainability of trained models an

58 Nov 19, 2022
Personalized Transfer of User Preferences for Cross-domain Recommendation (PTUPCDR)

This is the official implementation of our paper Personalized Transfer of User Preferences for Cross-domain Recommendation (PTUPCDR), which has been accepted by WSDM2022.

Yongchun Zhu 81 Dec 29, 2022
FindFunc is an IDA PRO plugin to find code functions that contain a certain assembly or byte pattern, reference a certain name or string, or conform to various other constraints.

FindFunc: Advanced Filtering/Finding of Functions in IDA Pro FindFunc is an IDA Pro plugin to find code functions that contain a certain assembly or b

213 Dec 17, 2022
Relative Positional Encoding for Transformers with Linear Complexity

Stochastic Positional Encoding (SPE) This is the source code repository for the ICML 2021 paper Relative Positional Encoding for Transformers with Lin

Antoine Liutkus 48 Nov 16, 2022
Weighted QMIX: Expanding Monotonic Value Function Factorisation

This repo contains the cleaned-up code that was used in "Weighted QMIX: Expanding Monotonic Value Function Factorisation"

whirl 82 Dec 29, 2022
This is the code of paper ``Contrastive Coding for Active Learning under Class Distribution Mismatch'' with python.

Contrastive Coding for Active Learning under Class Distribution Mismatch Official PyTorch implementation of ["Contrastive Coding for Active Learning u

21 Dec 22, 2022
MagFace: A Universal Representation for Face Recognition and Quality Assessment

MagFace MagFace: A Universal Representation for Face Recognition and Quality Assessment in IEEE Conference on Computer Vision and Pattern Recognition

Qiang Meng 523 Jan 05, 2023
Gas detection for Raspberry Pi using ADS1x15 and MQ-2 sensors

Gas detection Gas detection for Raspberry Pi using ADS1x15 and MQ-2 sensors. Description The MQ-2 sensor can detect multiple gases (CO, H2, CH4, LPG,

Filip Å  15 Sep 30, 2022
FEDn is an open-source, modular and ML-framework agnostic framework for Federated Machine Learning

FEDn is an open-source, modular and ML-framework agnostic framework for Federated Machine Learning (FedML) developed and maintained by Scaleout Systems. FEDn enables highly scalable cross-silo and cr

Scaleout 75 Nov 09, 2022
A Python implementation of active inference for Markov Decision Processes

A Python package for simulating Active Inference agents in Markov Decision Process environments. Please see our companion preprint on arxiv for an ove

235 Dec 21, 2022
PyTorch implementation of "Debiased Visual Question Answering from Feature and Sample Perspectives" (NeurIPS 2021)

D-VQA We provide the PyTorch implementation for Debiased Visual Question Answering from Feature and Sample Perspectives (NeurIPS 2021). Dependencies P

Zhiquan Wen 19 Dec 22, 2022
Lip Reading - Cross Audio-Visual Recognition using 3D Convolutional Neural Networks

Lip Reading - Cross Audio-Visual Recognition using 3D Convolutional Neural Networks - Official Project Page This repository contains the code develope

Amirsina Torfi 1.7k Dec 18, 2022