Multi-task Self-supervised Object Detection via Recycling of Bounding Box Annotations (CVPR, 2019)

Related tags

Deep Learningmtl-ssl
Overview

Multi-task Self-supervised Object Detection via Recycling of Bounding Box Annotations (CVPR 2019)

To make better use of given limited labels, we propose a novel object detection approach that takes advantage of both multi-task learning (MTL) and self-supervised learning (SSL). We propose a set of auxiliary tasks that help improve the accuracy of object detection.

Here is a guide to the source code.

Reference

If you are willing to use this code or cite the paper, please refer the following:

@inproceedings{lee2019multi,
 author = {Wonhee Lee and Joonil Na and Gunhee Kim},
 title = {Multi-task Self-supervised Object Detection via Recycling of Bounding Box Annotations},
 booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
 year = {2019}
}

CVPR Poster [PPT][PDF]

Introduction [PPT][PDF]

Multi-task Learning

Multi-task learning (MTL) aims at jointly training multiple relevant tasks with less annotations to improve the performance of each task.

[1] An Overview of Multi-Task Learning in Deep Neural Networks

[2] Mask R-CNN

Self-supervised Learning

Self-supervised learning (SSL) aims at training the model from the annotations generated by itself with no additional human effort.

[3] Learning Representations for Automatic Colorization

[4] Unsupervised learning of visual representations by solving jigsaw puzzles

Annotation Reuse

Reusing labels of one task is not only helpful to create new tasks and their labels but also capable of improving the performance of the main task through pretraining. Our work focuses on recycling bounding box labels for object detection.

[5] Look into Person: Self-supervised Structure-sensitive Learning and A New Benchmark for Human Parsing

[6] Mix-and-Match Tuning for Self-Supervised Semantic Segmentation

Our approach

The key to our approach is to propose a set of auxiliary tasks that are relevant but not identical to object detection. They create their own labels by recycling the bounding box labels (e.g. annotations of the main task) in an SSL manner while regarding the bounding box as metadata. Then these auxiliary tasks are jointly trained with the object detection model in an MTL way.

Approach

Overall architecture

It shows how the object detector (i.e. main task model) such as Faster R-CNN makes a prediction for a given proposal box (red) with assistance of three auxiliary tasks at inference. The auxiliary task models (shown in the bottom right) are almost identical to the main task predictor except no box regressor. The refinement of detection prediction (shown in right) is also collectively done by cooperation of the main and auxiliary task models. K is the number of categories.

3 auxiliary tasks

This is an example of how to generate labels of auxiliary tasks via recycling of GT bounding boxes.

  • The multi-object soft label assigns the area portions occupied by each class’s GT boxes within a window.
  • The closeness label scores the distances from the center of the GT box to those of other GT boxes.
  • The foreground label is a binary mask between foreground and background.

Results

We empirically validate that our approach effectively improves detection performance on various architectures and datasets. We test two state-of-the-art region proposal object detectors, including Faster R-CNN and R-FCN, with three CNN backbones of ResNet-101, InceptionResNet-v2, and MobileNet on two benchmark datasets of PASCAL VOC and COCO.

Qualitative results

Qualitative comparison of detection results between baseline (left) and our approach (right) in each set. We divide the errors into five categories (Localization, Classification, Redundancy, Background, False Negative). Our approach often improves the baseline’s detection by correcting several false negatives and false positives such as background, similar object and redundant detection.

A simple python program that can be used to implement user authentication tokens into your program...

token-generator A simple python module that can be used by developers to implement user authentication tokens into your program... code examples creat

octo 6 Apr 18, 2022
A blender add-on that automatically re-aligns wrong axis objects.

Auto Align A blender add-on that automatically re-aligns wrong axis objects. Usage There are three options available in the 3D Viewport Sidebar It

29 Nov 25, 2022
Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach

This repository holds the implementation for paper Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach Download our preproc

Qitian Wu 42 Dec 27, 2022
LBK 35 Dec 26, 2022
This is a model made out of Neural Network specifically a Convolutional Neural Network model

This is a model made out of Neural Network specifically a Convolutional Neural Network model. This was done with a pre-built dataset from the tensorflow and keras packages. There are other alternativ

9 Oct 18, 2022
2021-AIAC-QQ-Browser-Hyperparameter-Optimization-Rank6

2021-AIAC-QQ-Browser-Hyperparameter-Optimization-Rank6

Aigege 8 Mar 31, 2022
Azua - build AI algorithms to aid efficient decision-making with minimum data requirements.

Project Azua 0. Overview Many modern AI algorithms are known to be data-hungry, whereas human decision-making is much more efficient. The human can re

Microsoft 197 Jan 06, 2023
Pretrained Pytorch face detection (MTCNN) and recognition (InceptionResnet) models

Face Recognition Using Pytorch Python 3.7 3.6 3.5 Status This is a repository for Inception Resnet (V1) models in pytorch, pretrained on VGGFace2 and

Tim Esler 3.3k Jan 04, 2023
MCMC samplers for Bayesian estimation in Python, including Metropolis-Hastings, NUTS, and Slice

Sampyl May 29, 2018: version 0.3 Sampyl is a package for sampling from probability distributions using MCMC methods. Similar to PyMC3 using theano to

Mat Leonard 304 Dec 25, 2022
Implementing a simplified copy of Shazam application from scratch using MinHashing and LSH.

Building Shazam from scratch In this repository we tried to implement a simplified copy of the Shazam application able to tell you the name of a song

Arturo Ghinassi 0 Nov 17, 2022
[内测中]前向式Python环境快捷封装工具,快速将Python打包为EXE并添加CUDA、NoAVX等支持。

QPT - Quick packaging tool 快捷封装工具 GitHub主页 | Gitee主页 QPT是一款可以“模拟”开发环境的多功能封装工具,最短只需一行命令即可将普通的Python脚本打包成EXE可执行程序,并选择性添加CUDA和NoAVX的支持,尽可能兼容更多的用户环境。 感觉还可

QPT Family 545 Dec 28, 2022
MLSpace: Hassle-free machine learning & deep learning development

MLSpace: Hassle-free machine learning & deep learning development

abhishek thakur 293 Jan 03, 2023
StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

StackGAN Pytorch implementation Inception score evaluation StackGAN-v2-pytorch Tensorflow implementation for reproducing main results in the paper Sta

Han Zhang 1.8k Dec 21, 2022
Adaptive Dropblock Enhanced GenerativeAdversarial Networks for Hyperspectral Image Classification

This repo holds the codes of our paper: Adaptive Dropblock Enhanced GenerativeAdversarial Networks for Hyperspectral Image Classification, which is ac

Feng Gao 17 Dec 28, 2022
Code for SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations

The Second Situated Interactive MultiModal Conversations (SIMMC 2.0) Challenge 2021 Welcome to the Second Situated Interactive Multimodal Conversation

Facebook Research 81 Nov 22, 2022
Cross-platform-profile-pic-changer - Script to change profile pictures across multiple platforms

cross-platform-profile-pic-changer script to change profile pictures across mult

4 Jan 17, 2022
Collection of Docker images for ML/DL and video processing projects

Collection of Docker images for ML/DL and video processing projects. Overview of images Three types of images differ by tag postfix: base: Python with

OSAI 87 Nov 22, 2022
Pytorch ImageNet1k Loader with Bounding Boxes.

ImageNet 1K Bounding Boxes For some experiments, you might wanna pass only the background of imagenet images vs passing only the foreground. Here, I'v

Amin Ghiasi 11 Oct 15, 2022
StudioGAN is a Pytorch library providing implementations of representative Generative Adversarial Networks (GANs) for conditional/unconditional image generation.

StudioGAN is a Pytorch library providing implementations of representative Generative Adversarial Networks (GANs) for conditional/unconditional image generation.

3k Jan 08, 2023
Machine Learning automation and tracking

The Open-Source MLOps Orchestration Framework MLRun is an open-source MLOps framework that offers an integrative approach to managing your machine-lea

873 Jan 04, 2023