Official Datasets and Implementation from our Paper "Video Class Agnostic Segmentation in Autonomous Driving".

Overview

Video Class Agnostic Segmentation

[Method Paper] [Benchmark Paper] [Project] [Demo]

Official Datasets and Implementation from our Paper "Video Class Agnostic Segmentation Benchmark in Autonomous Driving" in Workshop on Autonomous Driving, CVPR 2021.



Installation

This repo is tested under Python 3.6, PyTorch 1.4

  • Download Required Packages
pip install -r requirements.txt
pip install "git+https://github.com/cocodataset/panopticapi.git"
  • Setup mmdet
python setup.py develop

Motion Segmentation Track

Dataset Preparation

Inference

  • Download Trained Weights on Ego Flow Suppressed, trained on Cityscapes and KITTI-MOTS

  • Modify Configs according to dataset path + Image/Annotation/Flow prefix

configs/data/kittimots_motion_supp.py
configs/data/cscapesvps_motion_supp.py
  • Evaluate CAQ,
python tools/test_eval_caq.py CONFIG_FILE WEIGHTS_FILE

CONFIG_FILE: configs/infer_kittimots.py or configs/infer_cscapesvps.py

  • Qualitative Results
python tools/test_vis.py CONFIG_FILE WEIGHTS_FILE --vis_unknown --save_dir OUTS_DIR
  • Evaluate Image Panoptic Quality, Note: evaluated on 1024x2048 Images
python tools/test_eval_ipq.py configs/infer_cscapesvps_pq.py WEIGHTS_FILE --out PKL_FILE

Training

Coming Soon ...

Open-set Segmentation Track

Coming soon ...

Acknowledgements

Dataset and Repository relied on these sources:

  • Voigtlaender, Paul, et al. "Mots: Multi-object tracking and segmentation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
  • Kim, Dahun, et al. "Video panoptic segmentation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
  • Wang, Xinlong, et al. "Solo: Segmenting objects by locations." European Conference on Computer Vision. Springer, Cham, 2020.
  • This Repository built upon SOLO Code

Citation

@article{siam2021video,
      title={Video Class Agnostic Segmentation Benchmark for Autonomous Driving}, 
      author={Mennatullah Siam and Alex Kendall and Martin Jagersand},
      year={2021},
      eprint={2103.11015},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Contact

If you have any questions regarding the dataset or repository, please contact [email protected].

Owner
Mennatullah Siam
PhD Student
Mennatullah Siam
TransReID: Transformer-based Object Re-Identification

TransReID: Transformer-based Object Re-Identification [arxiv] The official repository for TransReID: Transformer-based Object Re-Identification achiev

569 Dec 30, 2022
[ArXiv 2021] One-Shot Generative Domain Adaptation

GenDA - One-Shot Generative Domain Adaptation One-Shot Generative Domain Adaptation Ceyuan Yang*, Yujun Shen*, Zhiyi Zhang, Yinghao Xu, Jiapeng Zhu, Z

GenForce: May Generative Force Be with You 46 Dec 19, 2022
🕺Full body detection and tracking

Pose-Detection 🤔 Overview Human pose estimation from video plays a critical role in various applications such as quantifying physical exercises, sign

Abbas Ataei 20 Nov 21, 2022
Source code for ZePHyR: Zero-shot Pose Hypothesis Rating @ ICRA 2021

ZePHyR: Zero-shot Pose Hypothesis Rating ZePHyR is a zero-shot 6D object pose estimation pipeline. The core is a learned scoring function that compare

R-Pad - Robots Perceiving and Doing 18 Aug 22, 2022
Xi Dongbo 78 Nov 29, 2022
BiSeNet based on pytorch

BiSeNet BiSeNet based on pytorch 0.4.1 and python 3.6 Dataset Download CamVid dataset from Google Drive or Baidu Yun(6xw4). Pretrained model Download

367 Dec 26, 2022
Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for the task of Visual Document Understanding (VDU)

DocFormer - PyTorch Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for t

171 Jan 06, 2023
Repository of Vision Transformer with Deformable Attention

Vision Transformer with Deformable Attention This repository contains the code for the paper Vision Transformer with Deformable Attention [arXiv]. Int

410 Jan 03, 2023
PyTorch code of paper "LiVLR: A Lightweight Visual-Linguistic Reasoning Framework for Video Question Answering"

LiVLR-VideoQA We propose a Lightweight Visual-Linguistic Reasoning framework (LiVLR) for VideoQA. The overview of LiVLR: Evaluation on MSRVTT-QA Datas

JJ Jiang 7 Dec 30, 2022
SenseNet is a sensorimotor and touch simulator for deep reinforcement learning research

SenseNet is a sensorimotor and touch simulator for deep reinforcement learning research

59 Feb 25, 2022
[NeurIPS 2019] Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss

Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss Kaidi Cao, Colin Wei, Adrien Gaidon, Nikos Arechiga, Tengyu Ma This is the offi

Kaidi Cao 528 Jan 01, 2023
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.

Machine Learning From Scratch About Python implementations of some of the fundamental Machine Learning models and algorithms from scratch. The purpose

Erik Linder-Norén 21.8k Jan 09, 2023
This code is an implementation for Singing TTS.

MLP Singer This code is an implementation for Singing TTS. The algorithm is based on the following papers: Tae, J., Kim, H., & Lee, Y. (2021). MLP Sin

Heejo You 22 Dec 23, 2022
A Tensorflow implementation of BicycleGAN.

BicycleGAN implementation in Tensorflow As part of the implementation series of Joseph Lim's group at USC, our motivation is to accelerate (or sometim

Cognitive Learning for Vision and Robotics (CLVR) lab @ USC 97 Dec 02, 2022
Official code of paper "PGT: A Progressive Method for Training Models on Long Videos" on CVPR2021

PGT Code for paper PGT: A Progressive Method for Training Models on Long Videos. Install Run pip install -r requirements.txt. Run python setup.py buil

Bo Pang 27 Mar 30, 2022
SMIS - Semantically Multi-modal Image Synthesis(CVPR 2020)

Semantically Multi-modal Image Synthesis Project page / Paper / Demo Semantically Multi-modal Image Synthesis(CVPR2020). Zhen Zhu, Zhiliang Xu, Anshen

316 Dec 01, 2022
A lightweight Python-based 3D network multi-agent simulator. Uses a cell-based congestion model. Calculates risk, loudness and battery capacities of the agents. Suitable for 3D network optimization tasks.

AMAZ3DSim AMAZ3DSim is a lightweight python-based 3D network multi-agent simulator. It uses a cell-based congestion model. It calculates risk, battery

Daniel Hirsch 13 Nov 04, 2022
Lane follower: Lane-detector (OpenCV) + Object-detector (YOLO5) + CAN-bus

Lane Follower This code is for the lane follower, including perception and control, as shown below. Environment Hardware Industrial Camera Intel-NUC(1

Siqi Fan 3 Jul 07, 2022
This repository contains the files for running the Patchify GUI.

Repository Name Train-Test-Validation-Dataset-Generation App Name Patchify Description This app is designed for crop images and creating smal

Salar Ghaffarian 9 Feb 15, 2022
TensorFlow code for the neural network presented in the paper: "Structural Language Models of Code" (ICML'2020)

SLM: Structural Language Models of Code This is an official implementation of the model described in: "Structural Language Models of Code" [PDF] To ap

73 Nov 06, 2022