PolyTrack: Tracking with Bounding Polygons

Overview

PolyTrack: Tracking with Bounding Polygons

Abstract

In this paper, we present a novel method called PolyTrack for fast multi-object tracking and segmentation using bounding polygons. Polytrack detects objects by producing heatmaps of their center keypoint. For each of them, a rough segmentation is done by computing a bounding polygon over each instance instead of the traditional bounding box. Tracking is done by taking two consecutive frames as input and computing a center offset for each object detected in the first frame to predict their location in the second frame. A Kalman filter is also applied to reduce the number of ID switches. Since our target application is automated driving systems, we apply our method on urban environment videos. We train and evaluate PolyTrack on the MOTS and KITTIMOTS dataset.

Example results

Video examples from the KITTI MOTS test set:

Model

An overview of the PolyTrack architecture. The network takes as input the image at time t, I(t), the image at time t-1, I(t-1), as well as the heatmap at time t-1, H(t-1). Features are produced by the backbone and then used by five different network heads. The center heatmaps head is used for detecting and classifying objects, the polygon head is used for the segmentation part, the depth head is used to produce a relative depth between objects, the tracking head is used to produce an offset between frames at time t-1 and time t and finally the offset head is used for correctly upsampling images.

a) Generated Heatmap b) Generated Output

a): The center heatmap produced by the network to detect objects, b): the output of our method: a bounding polygon for each object, a class label, a track id as well as an offset from the previous frame.

Installation

Please refer to INSTALL.md for installation instructions.

Folder organization

  • /experiments: bash files to start repeat our experiments, you can also find an example of how to perform a demo.
  • /src/lib : contains the code needed to generate and train a model
  • /src/tools : contains tools relevant to different datasets, you can find the files we used to generate our ground truth here.
  • /data : not included in the git repo, but contains images from the dataset with the following structure:
  • /data/MOTS/test/ : contains test images
  • /data/MOTS/train/ : contains train images
  • /data/MOTS/seqmaps/ : contains seqmaps
  • /data/MOTS/json_gt/ : contains ground truth files generated by our tools

License

PolyTrack is released under the MIT License. PolyTrack is based upon CenterTrack and CenterPoly. Portions of the code are borrowed from CornerNet (hourglassnet, loss functions), dla (DLA network) and DCNv2(deformable convolutions). Please refer to the original License of these projects (See NOTICE).

Owner
Gaspar Faure
Gaspar Faure
Vision transformers (ViTs) have found only limited practical use in processing images

CXV Convolutional Xformers for Vision Vision transformers (ViTs) have found only limited practical use in processing images, in spite of their state-o

Cloudwalker 23 Sep 10, 2022
Multi-resolution SeqMatch based long-term Place Recognition

MRS-SLAM for long-term place recognition In this work, we imply an multi-resolution sambling based visual place recognition method. This work is based

METASLAM 6 Dec 06, 2022
Contains modeling practice materials and homework for the Computational Neuroscience course at Okinawa Institute of Science and Technology

A310 Computational Neuroscience - Okinawa Institute of Science and Technology, 2022 This repository contains modeling practice materials and homework

Sungho Hong 1 Jan 24, 2022
Source code of our work: "Benchmarking Deep Models for Salient Object Detection"

SALOD Source code of our work: "Benchmarking Deep Models for Salient Object Detection". In this works, we propose a new benchmark for SALient Object D

22 Dec 30, 2022
Code for 'Blockwise Sequential Model Learning for Partially Observable Reinforcement Learning' (AAAI 2022)

Blockwise Sequential Model Learning Code for 'Blockwise Sequential Model Learning for Partially Observable Reinforcement Learning' (AAAI 2022) For ins

2 Jun 17, 2022
Github Traffic Insights as Prometheus metrics.

github-traffic Github Traffic collects your repository's traffic data and exposes it as Prometheus metrics. Grafana dashboard that displays the metric

Grafana Labs 34 Oct 27, 2022
A solution to ensure Crowd Management with Contactless and Safe systems.

CovidTrack A Solution to ensure Crowd Management with Contactless and Safe systems. ML Model Mask Detection Social Distancing Detection Analytics Page

Om Khare 1 Nov 10, 2021
Dynamic Multi-scale Filters for Semantic Segmentation (DMNet ICCV'2019)

Dynamic Multi-scale Filters for Semantic Segmentation (DMNet ICCV'2019) Introduction Official implementation of Dynamic Multi-scale Filters for Semant

23 Oct 21, 2022
Discovering Explanatory Sentences in Legal Case Decisions Using Pre-trained Language Models.

Statutory Interpretation Data Set This repository contains the data set created for the following research papers: Savelka, Jaromir, and Kevin D. Ashl

17 Dec 23, 2022
MapReader: A computer vision pipeline for the semantic exploration of maps at scale

MapReader A computer vision pipeline for the semantic exploration of maps at scale MapReader is an end-to-end computer vision (CV) pipeline designed b

Living with Machines 25 Dec 26, 2022
Official codebase for Pretrained Transformers as Universal Computation Engines.

universal-computation Overview Official codebase for Pretrained Transformers as Universal Computation Engines. Contains demo notebook and scripts to r

Kevin Lu 210 Dec 28, 2022
A repository with exploration into using transformers to predict DNA ↔ transcription factor binding

Transcription Factor binding predictions with Attention and Transformers A repository with exploration into using transformers to predict DNA ↔ transc

Phil Wang 62 Dec 20, 2022
Learning Continuous Signed Distance Functions for Shape Representation

DeepSDF This is an implementation of the CVPR '19 paper "DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation" by Park et a

Meta Research 1.1k Jan 01, 2023
TCNN Temporal convolutional neural network for real-time speech enhancement in the time domain

TCNN Pandey A, Wang D L. TCNN: Temporal convolutional neural network for real-time speech enhancement in the time domain[C]//ICASSP 2019-2019 IEEE Int

凌逆战 16 Dec 30, 2022
[ICCV2021] Official code for "Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition"

CTR-GCN This repo is the official implementation for Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition. The pap

Yuxin Chen 148 Dec 16, 2022
A Streamlit component to render ECharts.

Streamlit - ECharts A Streamlit component to display ECharts. Install pip install streamlit-echarts Usage This library provides 2 functions to display

Fanilo Andrianasolo 290 Dec 30, 2022
Super Resolution for images using deep learning.

Neural Enhance Example #1 — Old Station: view comparison in 24-bit HD, original photo CC-BY-SA @siv-athens. As seen on TV! What if you could increase

Alex J. Champandard 11.7k Dec 29, 2022
StyleMapGAN - Official PyTorch Implementation

StyleMapGAN - Official PyTorch Implementation StyleMapGAN: Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing Hyunsu Kim, Yunj

NAVER AI 425 Dec 23, 2022
HistoKT: Cross Knowledge Transfer in Computational Pathology

HistoKT: Cross Knowledge Transfer in Computational Pathology Exciting News! HistoKT has been accepted to ICASSP 2022. HistoKT: Cross Knowledge Transfe

Mahdi S. Hosseini 5 Jan 05, 2023
Over-the-Air Ensemble Inference with Model Privacy

Over-the-Air Ensemble Inference with Model Privacy This repository contains simulations for our private ensemble inference method. Installation Instal

Selim Firat Yilmaz 1 Jun 29, 2022