TrackFormer: Multi-Object Tracking with Transformers

Overview

TrackFormer: Multi-Object Tracking with Transformers

This repository provides the official implementation of the TrackFormer: Multi-Object Tracking with Transformers paper by Tim Meinhardt, Alexander Kirillov, Laura Leal-Taixe and Christoph Feichtenhofer. The codebase builds upon DETR, Deformable DETR and Tracktor.

As the paper is still under submission this repository will continuously be updated and might at times not reflect the current state of the arXiv paper.

MOT17-03-SDP MOTS20-07

Abstract

The challenging task of multi-object tracking (MOT) requires simultaneous reasoning about track initialization, identity, and spatiotemporal trajectories. We formulate this task as a frame-to-frame set prediction problem and introduce TrackFormer, an end-to-end MOT approach based on an encoder-decoder Transformer architecture. Our model achieves data association between frames via attention by evolving a set of track predictions through a video sequence. The Transformer decoder initializes new tracks from static object queries and autoregressively follows existing tracks in space and time with the new concept of identity preserving track queries. Both decoder query types benefit from self- and encoder-decoder attention on global frame-level features, thereby omitting any additional graph optimization and matching or modeling of motion and appearance. TrackFormer represents a new tracking-by-attention paradigm and yields state-of-the-art performance on the task of multi-object tracking (MOT17) and segmentation (MOTS20).

TrackFormer casts multi-object tracking as a set prediction problem performing joint detection and tracking-by-attention. The architecture consists of a CNN for image feature extraction, a Transformer encoder for image feature encoding and a Transformer decoder which applies self- and encoder-decoder attention to produce output embeddings with bounding box and class information.

Installation

We refer to our docs/INSTALL.md for detailed installation instructions.

Train TrackFormer

We refer to our docs/TRAIN.md for detailed training instructions.

Evaluate TrackFormer

In order to evaluate TrackFormer on a multi-object tracking dataset, we provide the src/track.py script which supports several datasets and splits interchangle via the dataset_name argument (See src/datasets/tracking/factory.py for an overview of all datasets.) The default tracking configuration is specified in cfgs/track.yaml. To facilitate the reproducibility of our results, we provide evaluation metrics for both the train and test set.

MOT17

Private detections

python src/track.py reid
MOT17 MOTA IDF1 MT ML FP FN ID SW.
Train 68.1 67.6 816 207 33549 71937 1935
Test 65.0 63.9 1074 324 70443 123552 3528

Public detections (DPM, FRCNN, SDP)

python src/track.py with \
    reid \
    public_detections=min_iou_0_5 \
    obj_detect_checkpoint_file=models/mots20_train_masks/checkpoint.pth
MOT17 MOTA IDF1 MT ML FP FN ID SW.
Train 67.2 66.9 663 294 14640 94122 1866
Test 62.5 60.7 702 632 32828 174921 3917

MOTS20

python src/track.py with \
    dataset_name=MOTS20-ALL \
    obj_detect_checkpoint_file=models/mots20_train_masks/checkpoint.pth

Our tracking script only applies MOT17 metrics evaluation but outputs MOTS20 mask prediction files. To evaluate these download the official MOTChallengeEvalKit.

MOTS20 sMOTSA IDF1 FP FN IDs
Train -- -- -- -- --
Test 54.9 63.6 2233 7195 278

Demo

To facilitate the application of TrackFormer, we provide a demo interface which allows for a quick processing of a given video sequence.

ffmpeg -i data/snakeboard/snakeboard.mp4 -vf fps=30 data/snakeboard/%06d.png

python src/track.py with \
    dataset_name=DEMO \
    data_root_dir=data/snakeboard \
    output_dir=data/snakeboard \
    write_images=pretty
Snakeboard demo

Publication

If you use this software in your research, please cite our publication:

@InProceedings{meinhardt2021trackformer,
    title={TrackFormer: Multi-Object Tracking with Transformers},
    author={Tim Meinhardt and Alexander Kirillov and Laura Leal-Taixe and Christoph Feichtenhofer},
    year={2021},
    eprint={2101.02702},
    archivePrefix={arXiv},
}
Owner
Tim Meinhardt
Ph.D. candidate at the Dynamic Vision and Learning Group, TU Munich
Tim Meinhardt
A list of awesome PyTorch scholarship articles, guides, blogs, courses and other resources.

Awesome PyTorch Scholarship Resources A collection of awesome PyTorch and Python learning resources. Contributions are always welcome! Course Informat

Arnas Gečas 302 Dec 03, 2022
PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning"

Transparency-by-Design networks (TbD-nets) This repository contains code for replicating the experiments and visualizations from the paper Transparenc

David Mascharka 351 Nov 18, 2022
Official implementation for: Blended Diffusion for Text-driven Editing of Natural Images.

Blended Diffusion for Text-driven Editing of Natural Images Blended Diffusion for Text-driven Editing of Natural Images Omri Avrahami, Dani Lischinski

328 Dec 30, 2022
SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking

SPLADE 🍴 + 🥄 = 🔎 This repository contains the weights for four models as well as the code for running inference for our two papers: [v1]: SPLADE: S

NAVER 170 Dec 28, 2022
Benchmark tools for Compressive LiDAR-to-map registration

Benchmark tools for Compressive LiDAR-to-map registration This repo contains the released version of code and datasets used for our IROS 2021 paper: "

Allie 9 Nov 24, 2022
SMD-Nets: Stereo Mixture Density Networks

SMD-Nets: Stereo Mixture Density Networks This repository contains a Pytorch implementation of "SMD-Nets: Stereo Mixture Density Networks" (CVPR 2021)

Fabio Tosi 115 Dec 26, 2022
Simple reference implementation of GraphSAGE.

Reference PyTorch GraphSAGE Implementation Author: William L. Hamilton Basic reference PyTorch implementation of GraphSAGE. This reference implementat

William L Hamilton 861 Jan 06, 2023
A modern pure-Python library for reading PDF files

pdf A modern pure-Python library for reading PDF files. The goal is to have a modern interface to handle PDF files which is consistent with itself and

6 Apr 06, 2022
Code for the Lovász-Softmax loss (CVPR 2018)

The Lovász-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks Maxim Berman, Amal Ranne

Maxim Berman 1.3k Jan 04, 2023
Pansharpening by convolutional neural networks in the full resolution framework

Z-PNN: Zoom Pansharpening Neural Network Pansharpening by convolutional neural networks in the full resolution framework is a deep learning method for

20 Nov 24, 2022
InferPy: Deep Probabilistic Modeling with Tensorflow Made Easy

InferPy: Deep Probabilistic Modeling Made Easy InferPy is a high-level API for probabilistic modeling written in Python and capable of running on top

PGM-Lab 141 Oct 13, 2022
Pytorch implementation for "Density-aware Chamfer Distance as a Comprehensive Metric for Point Cloud Completion" (NeurIPS 2021)

Density-aware Chamfer Distance This repository contains the official PyTorch implementation of our paper: Density-aware Chamfer Distance as a Comprehe

Tong WU 93 Dec 15, 2022
HMLET (Hybrid-Method-of-Linear-and-non-linEar-collaborative-filTering-method)

Methods HMLET (Hybrid-Method-of-Linear-and-non-linEar-collaborative-filTering-method) Dynamically selecting the best propagation method for each node

Yong 7 Dec 18, 2022
A toolkit for Lagrangian-based constrained optimization in Pytorch

Cooper About Cooper is a toolkit for Lagrangian-based constrained optimization in Pytorch. This library aims to encourage and facilitate the study of

Cooper 34 Jan 01, 2023
High-quality implementations of standard and SOTA methods on a variety of tasks.

Uncertainty Baselines The goal of Uncertainty Baselines is to provide a template for researchers to build on. The baselines can be a starting point fo

Google 1.1k Dec 30, 2022
Improving Object Detection by Estimating Bounding Box Quality Accurately

Improving Object Detection by Estimating Bounding Box Quality Accurately Abstrac

2 Apr 14, 2022
Pytorch implementation of the paper SPICE: Semantic Pseudo-labeling for Image Clustering

SPICE: Semantic Pseudo-labeling for Image Clustering By Chuang Niu and Ge Wang This is a Pytorch implementation of the paper. (In updating) SOTA on 5

Chuang Niu 154 Dec 15, 2022
Detecting and Tracking Small and Dense Moving Objects in Satellite Videos: A Benchmark

This dataset is a large-scale dataset for moving object detection and tracking in satellite videos, which consists of 40 satellite videos captured by Jilin-1 satellite platforms.

Qingyong 87 Dec 22, 2022
TAug :: Time Series Data Augmentation using Deep Generative Models

TAug :: Time Series Data Augmentation using Deep Generative Models Note!!! The package is under development so be careful for using in production! Fea

35 Dec 06, 2022