SiamMOT is a region-based Siamese Multi-Object Tracking network that detects and associates object instances simultaneously.

Overview

SiamMOT

SiamMOT is a region-based Siamese Multi-Object Tracking network that detects and associates object instances simultaneously.

SiamMOT: Siamese Multi-Object Tracking,
Bing Shuai, Andrew Berneshawi, Xinyu Li, Davide Modolo, Joseph Tighe,

@inproceedings{shuai2021siammot,
  title={SiamMOT: Siamese Multi-Object Tracking},
  author={Shuai, Bing and Berneshawi, Andrew and Li, Xinyu and Modolo, Davide and Tighe, Joseph},
  booktitle={CVPR},
  year={2021}
}

Abstract

In this paper, we focus on improving online multi-object tracking (MOT). In particular, we introduce a region-based Siamese Multi-Object Tracking network, which we name SiamMOT. SiamMOT includes a motion model that estimates the instance’s movement between two frames such that detected instances are associated. To explore how the motion modelling affects its tracking capability, we present two variants of Siamese tracker, one that implicitly models motion and one that models it explicitly. We carry out extensive quantitative experiments on three different MOT datasets: MOT17, TAO-person and Caltech Roadside Pedestrians, showing the importance of motion modelling for MOT and the ability of SiamMOT to substantially outperform the state-of-the-art. Finally, SiamMOT also outperforms the winners of ACM MM’20 HiEve Grand Challenge on HiEve dataset. Moreover, SiamMOT is efficient, and it runs at 17 FPS for 720P videos on a single modern GPU.

Installation

Please refer to INSTALL.md for installation instructions.

Try SiamMOT demo

For demo purposes, we provide two tracking models -- tracking person (visible part) or jointly tracking person and vehicles (bus, car, truck, motorcycle, etc). The person tracking model is trained on COCO-17 and CrowdHuman, while the latter model is trained on COCO-17 and VOC12. Currently, both models used in demos use EMM as its motion model, which performs best among different alternatives.

In order to run the demo, use the following command:

python3 demos/demo.py --demo-video  PATH_TO_DEMO_VIDE --track-class person --dump-video True

You can choose person or person_vehicel for track-class such that person tracking or person/vehicle tracking model is used accordingly.

The model would be automatically downloaded to demos/models, and the visualization of tracking outputs is automatically saved to demos/demo_vis

We also provide several pre-trained models in model_zoos.md that can be used for demo.

Dataset Evaluation and Training

After installation, follow the instructions in DATA.md to setup the datasets. As a sanity check, the models presented in model_zoos.md can be used to for benchmark testing.

Use the following command to train a model on an 8-GPU machine: Before running training / inference, setup the configuration file properly

python3 -m torch.distributed.launch --nproc_per_node=8 tools/train_net.py --config-file configs/dla/DLA_34_FPN.yaml --train-dir PATH_TO_TRAIN_DIR --model-suffix MODEL_SUFFIX 

Use the following command to test a model on a single-GPU machine:

python3 tools/test_net.py --config-file configs/dla/DLA_34_FPN.yaml --output-dir PATH_TO_OUTPUT_DIR --model-file PATH_TO_MODEL_FILE --test-dataset DATASET_KEY --set val

Note: If you get an error ModuleNotFoundError: No module named 'siammot' when running in the git root then make sure your PYTHONPATH includes the current directory, which you can add by running: export PYTHONPATH=.:$PYTHONPATH or you can explicitly add the project to the path by replacing the '.' in the export command with the absolute path to the git root.

Multi-gpu testing is going to be supported later.

Version

This is the preliminary version specifically for Airbone Object Tracking (AOT) workshop. The current version only support the motion model being EMM.

We will add more motion models in the next version, together with more features, stay tuned.

License

This project is licensed under the Apache-2.0 License.

Simulation environments for the CrazyFlie quadrotor: Used for Reinforcement Learning and Sim-to-Real Transfer

Phoenix-Drone-Simulation An OpenAI Gym environment based on PyBullet for learning to control the CrazyFlie quadrotor: Can be used for Reinforcement Le

Sven Gronauer 8 Dec 07, 2022
PyTorch implementation of Lip to Speech Synthesis with Visual Context Attentional GAN (NeurIPS2021)

Lip to Speech Synthesis with Visual Context Attentional GAN This repository contains the PyTorch implementation of the following paper: Lip to Speech

6 Nov 02, 2022
VR-Caps: A Virtual Environment for Active Capsule Endoscopy

VR-Caps: A Virtual Environment for Capsule Endoscopy Overview We introduce a virtual active capsule endoscopy environment developed in Unity that prov

DeepMIA Lab 90 Dec 27, 2022
Streaming Anomaly Detection Framework in Python (Outlier Detection for Streaming Data)

Python Streaming Anomaly Detection (PySAD) PySAD is an open-source python framework for anomaly detection on streaming multivariate data. Documentatio

Selim Firat Yilmaz 181 Dec 18, 2022
A simple, fully convolutional model for real-time instance segmentation.

You Only Look At CoefficienTs ██╗ ██╗ ██████╗ ██╗ █████╗ ██████╗████████╗ ╚██╗ ██╔╝██╔═══██╗██║ ██╔══██╗██╔════╝╚══██╔══╝ ╚██

Daniel Bolya 4.6k Dec 30, 2022
Greedy Gaussian Segmentation

GGS Greedy Gaussian Segmentation (GGS) is a Python solver for efficiently segmenting multivariate time series data. For implementation details, please

Stanford University Convex Optimization Group 72 Dec 07, 2022
A PyTorch Implementation of PGL-SUM from "Combining Global and Local Attention with Positional Encoding for Video Summarization", Proc. IEEE ISM 2021

PGL-SUM: Combining Global and Local Attention with Positional Encoding for Video Summarization PyTorch Implementation of PGL-SUM From "PGL-SUM: Combin

Evlampios Apostolidis 35 Dec 22, 2022
Implementation of "Semi-supervised Domain Adaptive Structure Learning"

Semi-supervised Domain Adaptive Structure Learning - ASDA This repo contains the source code and dataset for our ASDA paper. Illustration of the propo

3 Dec 13, 2021
An Implicit Function Theorem (IFT) optimizer for bi-level optimizations

iftopt An Implicit Function Theorem (IFT) optimizer for bi-level optimizations. Requirements Python 3.7+ PyTorch 1.x Installation $ pip install git+ht

The Money Shredder Lab 2 Dec 02, 2021
Official implementation of particle-based models (GNS and DPI-Net) on the Physion dataset.

Physion: Evaluating Physical Prediction from Vision in Humans and Machines [paper] Daniel M. Bear, Elias Wang, Damian Mrowca, Felix J. Binder, Hsiao-Y

Hsiao-Yu Fish Tung 18 Dec 19, 2022
Official implementation of the paper ``Unifying Nonlocal Blocks for Neural Networks'' (ICCV'21)

Spectral Nonlocal Block Overview Official implementation of the paper: Unifying Nonlocal Blocks for Neural Networks (ICCV'21) Spectral View of Nonloca

91 Dec 14, 2022
Python implementation of 3D facial mesh exaggeration using the techniques described in the paper: Computational Caricaturization of Surfaces.

Python implementation of 3D facial mesh exaggeration using the techniques described in the paper: Computational Caricaturization of Surfaces.

Wonjong Jang 8 Nov 01, 2022
ICLR2021 (Under Review)

Self-Supervised Time Series Representation Learning by Inter-Intra Relational Reasoning This repository contains the official PyTorch implementation o

Haoyi Fan 58 Dec 30, 2022
A simple command line tool for text to image generation, using OpenAI's CLIP and a BigGAN.

Ryan Murdock has done it again, combining OpenAI's CLIP and the generator from a BigGAN! This repository wraps up his work so it is easily accessible to anyone who owns a GPU.

Phil Wang 2.3k Jan 09, 2023
Pretraining Representations For Data-Efficient Reinforcement Learning

Pretraining Representations For Data-Efficient Reinforcement Learning Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand, Laurent Ch

Mila 40 Dec 11, 2022
For IBM Quantum Challenge 2021 (May 20 - 26)

IBM Quantum Challenge 2021 Introduction Commemorating the 40-year anniversary of the Physics of Computation conference, and 5-year anniversary of IBM

Qiskit Community 140 Jan 01, 2023
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

Microsoft 8.4k Jan 01, 2023
Efficient 3D human pose estimation in video using 2D keypoint trajectories

3D human pose estimation in video with temporal convolutions and semi-supervised training This is the implementation of the approach described in the

Meta Research 3.1k Dec 29, 2022
DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models

DSEE Codes for [Preprint] DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models Xuxi Chen, Tianlong Chen, Yu Cheng, Weizhu Ch

VITA 4 Dec 27, 2021