DanceTrack: Multiple Object Tracking in Uniform Appearance and Diverse Motion

Overview

DanceTrack

DanceTrack is a benchmark for tracking multiple objects in uniform appearance and diverse motion.

DanceTrack provides box and identity annotations.

DanceTrack contains 100 videos, 40 for training(annotations public), 25 for validation(annotations public) and 35 for testing(annotations unpublic). For evaluating on test set, please see CodaLab.


Paper

DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion

Dataset

Download the dataset from Google Drive or Baidu Drive (code:awew).

Organize as follows:

{DanceTrack ROOT}
|-- dancetrack
|   |-- train
|   |   |-- dancetrack0001
|   |   |   |-- img1
|   |   |   |   |-- 00000001.jpg
|   |   |   |   |-- ...
|   |   |   |-- gt
|   |   |   |   |-- gt.txt            
|   |   |   |-- seqinfo.ini
|   |   |-- ...
|   |-- val
|   |   |-- ...
|   |-- test
|   |   |-- ...
|   |-- train_seqmap.txt
|   |-- val_seqmap.txt
|   |-- test_seqmap.txt
|-- TrackEval
|-- tools
|-- ...

We align our dataset annotations with MOT, so each line in gt.txt contains:

<frame>, <id>, <bb_left>, <bb_top>, <bb_width>, <bb_height>, 1, 1, 1

Evaluation

We use ByteTrack as an example of using DanceTrack. For training details, please see instruction. We provide the trained models in Google Drive or or Baidu Drive (code:awew).

To do evaluation with our provided tookit, we organize the results of validation set as follows:

{DanceTrack ROOT}
|-- val
|   |-- TRACKER_NAME
|   |   |-- dancetrack000x.txt
|   |   |-- ...
|   |-- ...

where dancetrack000x.txt is the output file of the video episode dancetrack000x, each line of which contains:

<frame>, <id>, <bb_left>, <bb_top>, <bb_width>, <bb_height>, <conf>, -1, -1, -1

Then, simply run the evalution code:

python3 TrackEval/scripts/run_mot_challenge.py --SPLIT_TO_EVAL val  --METRICS HOTA CLEAR Identity  --GT_FOLDER dancetrack/val --SEQMAP_FILE dancetrack/val_seqmap.txt --SKIP_SPLIT_FOL True   --TRACKERS_TO_EVAL '' --TRACKER_SUB_FOLDER ''  --USE_PARALLEL True --NUM_PARALLEL_CORES 8 --PLOT_CURVES False --TRACKERS_FOLDER val/TRACKER_NAME 
Tracker HOTA DetA AssA MOTA IDF1
ByteTrack 47.1 70.5 31.5 88.2 51.9

Besides, we also provide the visualization script. The usage is as follow:

python3 tools/txt2video_dance.py --img_path dancetrack --split val --tracker TRACKER_NAME

Competition

Organize the results of test set as follows:

{DanceTrack ROOT}
|-- test
|   |-- tracker
|   |   |-- dancetrack000x.txt
|   |   |-- ...

Each line of dancetrack000x.txt contains:

<frame>, <id>, <bb_left>, <bb_top>, <bb_width>, <bb_height>, <conf>, -1, -1, -1

Archive tracker folder to tracker.zip and submit to CodaLab. Please note: (1) archive tracker folder, instead of txt files. (2) the folder name must be tracker.

The return will be:

Tracker HOTA DetA AssA MOTA IDF1
tracker 47.7 71.0 32.1 89.6 53.9

For more detailed metrics and metrics on each video, click on download output from scoring step in CodaLab.

Run the visualization code:

python3 tools/txt2video_dance.py --img_path dancetrack --split test --tracker tracker

Joint-Training

We use joint-training with other datasets to predict mask, pose and depth. CenterNet is provided as an example. For details of joint-trainig, please see joint-training instruction. We provide the trained models in Google Drive or Baidu Drive(code:awew).

For mask demo, run

cd CenterNet/src
python3 demo.py ctseg --demo  ../../dancetrack/val/dancetrack000x/img1 --load_model ../models/dancetrack_coco_mask.pth --debug 4 --tracking 
cd ../..
python3 tools/img2video.py --img_file CenterNet/exp/ctseg/default/debug --video_name dancetrack000x_mask.avi

For pose demo, run

cd CenterNet/src
python3 demo.py multi_pose --demo  ../../dancetrack/val/dancetrack000x/img1 --load_model ../models/dancetrack_coco_pose.pth --debug 4 --tracking 
cd ../..
python3 tools/img2video.py --img_file CenterNet/exp/multi_pose/default/debug --video_name dancetrack000x_pose.avi

For depth demo, run

cd CenterNet/src
python3 demo.py ddd --demo  ../../dancetrack/val/dancetrack000x/img1 --load_model ../models/dancetrack_kitti_ddd.pth --debug 4 --tracking --test_focal_length 640 --world_size 16 --out_size 128
cd ../..
python3 tools/img2video.py --img_file CenterNet/exp/ddd/default/debug --video_name dancetrack000x_ddd.avi

Agreement

  • The dataset of DanceTrack is available for non-commercial research purposes only.
  • All videos and images of DanceTrack are obtained from the Internet which are not property of HKU, CMU or ByteDance. These three organizations are not responsible for the content nor the meaning of these videos and images.
  • The code of DanceTrack is released under the MIT License.

Acknowledgement

The evaluation metrics and code are from MOT Challenge and TrackEval. The inference code is from ByteTrack. The joint-training code is modified from CenterTrack and CenterNet, where the instance segmentation code is from CenterNet-CondInst. Thanks for their wonderful and pioneering works !

Citation

If you use DanceTrack in your research or wish to refer to the baseline results published here, please use the following BibTeX entry:

@article{peize2021dance,
  title   =  {DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion},
  author  =  {Peize Sun and Jinkun Cao and Yi Jiang and Zehuan Yuan and Song Bai and Kris Kitani and Ping Luo},
  journal =  {arXiv preprint arXiv:2111.14690},
  year    =  {2021}
}
Code and datasets for the paper "KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction"

KnowPrompt Code and datasets for our paper "KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction" Requireme

ZJUNLP 137 Dec 31, 2022
Using this you can control your PC/Laptop volume by Hand Gestures (pinch-in, pinch-out) created with Python.

Hand Gesture Volume Controller Using this you can control your PC/Laptop volume by Hand Gestures (pinch-in, pinch-out). Code Firstly I have created a

Tejas Prajapati 16 Sep 11, 2021
Autonomous Perception: 3D Object Detection with Complex-YOLO

Autonomous Perception: 3D Object Detection with Complex-YOLO LiDAR object detect

Thomas Dunlap 2 Feb 18, 2022
An self sufficient AI that crawls the web to learn how to generate art from keywords

Roxx-IO - The Smart Artist AI! TO DO / IDEAS Implement Web-Scraping Functionality Figure out a less annoying (and an off button for it) text to speech

Tatz 5 Mar 21, 2022
Unofficial implementation of MLP-Mixer: An all-MLP Architecture for Vision

MLP-Mixer: An all-MLP Architecture for Vision This repo contains PyTorch implementation of MLP-Mixer: An all-MLP Architecture for Vision. Usage : impo

Rishikesh (ऋषिकेश) 175 Dec 23, 2022
Physics-Informed Neural Networks (PINN) and Deep BSDE Solvers of Differential Equations for Scientific Machine Learning (SciML) accelerated simulation

NeuralPDE NeuralPDE.jl is a solver package which consists of neural network solvers for partial differential equations using scientific machine learni

SciML Open Source Scientific Machine Learning 680 Jan 02, 2023
Clustering is a popular approach to detect patterns in unlabeled data

Visual Clustering Clustering is a popular approach to detect patterns in unlabeled data. Existing clustering methods typically treat samples in a data

Tarek Naous 24 Nov 11, 2022
Demonstrates how to divide a DL model into multiple IR model files (division) and introduce a simplest way to implement a custom layer works with OpenVINO IR models.

Demonstration of OpenVINO techniques - Model-division and a simplest-way to support custom layers Description: Model Optimizer in Intel(r) OpenVINO(tm

Yasunori Shimura 12 Nov 09, 2022
Code for the paper "M2m: Imbalanced Classification via Major-to-minor Translation" (CVPR 2020)

M2m: Imbalanced Classification via Major-to-minor Translation This repository contains code for the paper "M2m: Imbalanced Classification via Major-to

79 Oct 13, 2022
Multiple paper open-source codes of the Microsoft Research Asia DKI group

📫 Paper Code Collection (MSRA DKI Group) This repo hosts multiple open-source codes of the Microsoft Research Asia DKI Group. You could find the corr

Microsoft 249 Jan 08, 2023
an implementation of Revisiting Adaptive Convolutions for Video Frame Interpolation using PyTorch

revisiting-sepconv This is a reference implementation of Revisiting Adaptive Convolutions for Video Frame Interpolation [1] using PyTorch. Given two f

Simon Niklaus 59 Dec 22, 2022
Language Models Can See: Plugging Visual Controls in Text Generation

Language Models Can See: Plugging Visual Controls in Text Generation Authors: Yixuan Su, Tian Lan, Yahui Liu, Fangyu Liu, Dani Yogatama, Yan Wang, Lin

Yixuan Su 195 Dec 22, 2022
Born-Infeld (BI) for AI: Energy-Conserving Descent (ECD) for Optimization

Born-Infeld (BI) for AI: Energy-Conserving Descent (ECD) for Optimization This repository contains the code for the BBI optimizer, introduced in the p

G. Bruno De Luca 5 Sep 06, 2022
Convert openmmlab (not only mmdetection) series model to tensorrt

MMDet to TensorRT This project aims to convert the mmdetection model to TensorRT model end2end. Focus on object detection for now. Mask support is exp

JinTian 4 Dec 17, 2021
Pretraining Representations For Data-Efficient Reinforcement Learning

Pretraining Representations For Data-Efficient Reinforcement Learning Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand, Laurent Ch

Mila 40 Dec 11, 2022
Minimal fastai code needed for working with pytorch

fastai_minima A mimal version of fastai with the barebones needed to work with Pytorch #all_slow Install pip install fastai_minima How to use This lib

Zachary Mueller 14 Oct 21, 2022
Compositional and Parameter-Efficient Representations for Large Knowledge Graphs

NodePiece - Compositional and Parameter-Efficient Representations for Large Knowledge Graphs NodePiece is a "tokenizer" for reducing entity vocabulary

Michael Galkin 107 Jan 04, 2023
Multi Agent Reinforcement Learning for ROS in 2D Simulation Environments

IROS21 information To test the code and reproduce the experiments, follow the installation steps in Installation.md. Afterwards, follow the steps in E

11 Oct 29, 2022
GAN-generated image detection based on CNNs

GAN-image-detection This repository contains a GAN-generated image detector developed to distinguish real images from synthetic ones. The detector is

Image and Sound Processing Lab 17 Dec 15, 2022
Implementation of SE3-Transformers for Equivariant Self-Attention, in Pytorch.

SE3 Transformer - Pytorch Implementation of SE3-Transformers for Equivariant Self-Attention, in Pytorch. May be needed for replicating Alphafold2 resu

Phil Wang 207 Dec 23, 2022