Object detection on multiple datasets with an automatically learned unified label space.

Overview

Simple multi-dataset detection

An object detector trained on multiple large-scale datasets with a unified label space; Winning solution of ECCV 2020 Robust Vision Challenges.

Simple multi-dataset detection,
Xingyi Zhou, Vladlen Koltun, Philipp Krähenbühl,
arXiv technical report (arXiv 2102.13086)

Contact: [email protected]. Any questions or discussions are welcomed!

Abstract

How do we build a general and broad object detection system? We use all labels of all concepts ever annotated. These labels span diverse datasets with potentially inconsistent taxonomies. In this paper, we present a simple method for training a unified detector on multiple large-scale datasets. We use dataset-specific training protocols and losses, but share a common detection architecture with dataset-specific outputs. We show how to automatically integrate these dataset-specific outputs into a common semantic taxonomy. In contrast to prior work, our approach does not require manual taxonomy reconciliation. Our multi-dataset detector performs as well as dataset-specific models on each training domain, but generalizes much better to new unseen domains. Entries based on the presented methodology ranked first in the object detection and instance segmentation tracks of the ECCV 2020 Robust Vision Challenge.

Features at a glance

  • We trained a unified object detector on 4 large-scale detection datasets: COCO, Objects365, OpenImages, and Mapillary, with state-of-the-art performance on all of them.

  • The model predicts class labels in a learned unified label space.

  • The model can be directly used to test on novel datasets outside the training datasets.

  • In this repo, we also provide state-of-the-art baselines for Objects365 and OpenImages.

Main results

COCO test-challenge OpenImages public test Mapillary test Objects365 val
52.9 60.6 25.3 33.7

Results are obtained using a Cascade-RCNN with ResNeSt200 trained in an 8x schedule.

  • Unified model vs. ensemble of dataset-specific models with known test domains.
COCO Objects365 OpenImages mean.
Unified 45.4 24.4 66.0 45.3
Dataset-specific models 42.5 24.9 65.7 44.4

Results are obtained using a Cascade-RCNN with Res50 trained in an 8x schedule.

  • Zero-shot cross dataset evaluation
VOC VIPER CityScapes ScanNet WildDash CrowdHuman KITTI mean
Unified 82.9 21.3 52.6 29.8 34.7 70.7 39.9 47.3
Oracle models 80.3 31.8 54.6 44.7 - 80.0 - -

Results are obtained using a Cascade-RCNN with Res50 trained in an 8x schedule.

More models can be found in our MODEL ZOO.

Installation

Our project is developed on detectron2. Please follow the official detectron2 installation. All our code is under projects/UniDet/. In theory, you should be able to copy-paste projects/UniDet/ to the latest detectron2 release or your own detectron2 repo to run our project. There might be API changes in future detectron2 releases that make it incompatible.

Demo

We use the same inference API as detectorn2. To run inference on an image folder using our pretrained model, run

python projects/UniDet/demo/demo.py --config-file projects/UniDet/configs/Unified_learned_OCIM_R50_6x+2x.yaml --input images/*.jpg --opts MODEL.WEIGHTS models/Unified_learned_OCIM_R50_6x+2x.pth

If setup correctly, the output should look like:

*The sample image is from WildDash dataset.

Note that the model predicts all labels in its label hierarchy tree (for example, both vehicle and car for a car), following the protocol in OpenImages.

Benchmark evaluation and training

After installation, follow the instructions in DATASETS.md to setup the (many) datasets. Then check REPRODUCE.md to reproduce the results in the paper.

License

All our code under projects/Unidet/ is under Apache 2.0 license. The code from detectron2 follows the original Apache 2.0 license.

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@inproceedings{zhou2021simple,
  title={Simple multi-dataset detection},
  author={Zhou, Xingyi and Koltun, Vladlen and Kr{\"a}henb{\"u}hl, Philipp},
  booktitle={arXiv preprint arXiv:2102.13086},
  year={2021}
}
Owner
Xingyi Zhou
CS Ph.D. student at UT Austin.
Xingyi Zhou
Official Matlab Implementation for "Tiny Obstacle Discovery by Occlusion-aware Multilayer Regression", TIP 2020

Tiny Obstacle Discovery by Occlusion-aware Multilayer Regression Official Matlab Implementation for "Tiny Obstacle Discovery by Occlusion-aware Multil

Xuefeng 5 Jan 15, 2022
All course materials for the Zero to Mastery Machine Learning and Data Science course.

Zero to Mastery Machine Learning Welcome! This repository contains all of the code, notebooks, images and other materials related to the Zero to Maste

Daniel Bourke 1.6k Jan 08, 2023
BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation

BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation This is a demo implementation of BYOL for Audio (BYOL-A), a self-sup

NTT Communication Science Laboratories 160 Jan 04, 2023
Software & Hardware to do multi color printing with Sharpies

3D Print Colorizer is a combination of 3D printed parts and a Cura plugin which allows anyone with an Ender 3 like 3D printer to produce multi colored

343 Jan 06, 2023
Paddle-Adversarial-Toolbox (PAT) is a Python library for Deep Learning Security based on PaddlePaddle.

Paddle-Adversarial-Toolbox Paddle-Adversarial-Toolbox (PAT) is a Python library for Deep Learning Security based on PaddlePaddle. Model Zoo Common FGS

AgentMaker 17 Nov 08, 2022
Code for "Solving Graph-based Public Good Games with Tree Search and Imitation Learning"

Code for "Solving Graph-based Public Good Games with Tree Search and Imitation Learning" This is the code for the paper Solving Graph-based Public Goo

Victor-Alexandru Darvariu 3 Dec 05, 2022
Polynomial-time Meta-Interpretive Learning

Louise - polynomial-time Program Learning Getting help with Louise Louise's author can be reached by email at Stassa Patsantzis 64 Dec 26, 2022

The official implementation of NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation [ICLR-2021]. https://arxiv.org/pdf/2101.12378.pdf

NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation [ICLR-2021] Release Notes The offical PyTorch implementation of NeMo, p

Angtian Wang 76 Nov 23, 2022
Tensorflow 2 implementation of the paper: Learning and Evaluating Representations for Deep One-class Classification published at ICLR 2021

Deep Representation One-class Classification (DROC). This is not an officially supported Google product. Tensorflow 2 implementation of the paper: Lea

Google Research 137 Dec 23, 2022
Code for LIGA-Stereo Detector, ICCV'21

LIGA-Stereo Introduction This is the official implementation of the paper LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based

Xiaoyang Guo 75 Dec 09, 2022
Repository for GNSS-based position estimation using a Deep Neural Network

Code repository accompanying our work on 'Improving GNSS Positioning using Neural Network-based Corrections'. In this paper, we present a Deep Neural

32 Dec 13, 2022
[NeurIPS 2021] Shape from Blur: Recovering Textured 3D Shape and Motion of Fast Moving Objects

[NeurIPS 2021] Shape from Blur: Recovering Textured 3D Shape and Motion of Fast Moving Objects YouTube | arXiv Prerequisites Kaolin is available here:

Denys Rozumnyi 107 Dec 26, 2022
Deep Learning Models for Causal Inference

Extensive tutorials for learning how to build deep learning models for causal inference using selection on observables in Tensorflow 2.

Bernard J Koch 151 Dec 31, 2022
The Simplest DCGAN Implementation

DCGAN in TensorLayer This is the TensorLayer implementation of Deep Convolutional Generative Adversarial Networks. Looking for Text to Image Synthesis

TensorLayer Community 310 Dec 13, 2022
SigOpt wrappers for scikit-learn methods

SigOpt + scikit-learn Interfacing This package implements useful interfaces and wrappers for using SigOpt and scikit-learn together Getting Started In

SigOpt 73 Sep 30, 2022
Model that predicts the probability of a Twitter user being anti-vaccination.

stylebody {text-align: justify}/style AVAXTAR: Anti-VAXx Tweet AnalyzeR AVAXTAR is a python package to identify anti-vaccine users on twitter. The

10 Sep 27, 2022
CondenseNet V2: Sparse Feature Reactivation for Deep Networks

CondenseNetV2 This repository is the official Pytorch implementation for "CondenseNet V2: Sparse Feature Reactivation for Deep Networks" paper by Le Y

Haojun Jiang 74 Dec 12, 2022
Convert Apple NeuralHash model for CSAM Detection to ONNX.

Apple NeuralHash is a perceptual hashing method for images based on neural networks. It can tolerate image resize and compression.

Asuhariet Ygvar 1.5k Dec 31, 2022
PyBrain - Another Python Machine Learning Library.

PyBrain -- the Python Machine Learning Library =============================================== INSTALLATION ------------ Quick answer: make sure you

2.8k Dec 31, 2022
A super lightweight Lagrangian model for calculating millions of trajectories using ERA5 data

Easy-ERA5-Trck Easy-ERA5-Trck Galleries Install Usage Repository Structure Module Files Version iteration Easy-ERA5-Trck is a super lightweight Lagran

Zhenning Li 26 Nov 19, 2022