Official code for "EagerMOT: 3D Multi-Object Tracking via Sensor Fusion" [ICRA 2021]

Overview

EagerMOT: 3D Multi-Object Tracking via Sensor Fusion

Read our ICRA 2021 paper here.

Check out the 3-minute video for a quick intro or the full presentation video for more details.

This repo contains code for our ICRA 2021 paper. Benchmark results can be fully reproduced with minimal work; only the data location variables need to be edited. If desired, our ablation results can also be reproduced, though this requires a few more adjustments. An earlier version of this paper also appeared as a short 4-page paper at the CVPR 2020 MOTChallenge Workshop.


Improve your online 3D multi-object tracking performance by using 2D detections to support tracking when 3D association fails. The method adds minimal overhead and does not rely on dedicated hardware or any particular sensor setup. The current Python implementation runs at 90 FPS on KITTI data and can certainly be optimized further for actual deployment.

The framework is flexible enough to work with any 3D/2D detection sources (we used only off-the-shelf models) and can be extended to other tracking-related tasks, e.g. MOTS.
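To make the idea concrete, below is a minimal, self-contained sketch of such a two-stage association: greedy 3D matching first, then a 2D IoU fallback for tracks that the 3D stage left unmatched. All names, thresholds, and the greedy matching itself are illustrative simplifications, not the repo's actual API, which uses its own fused-instance structures and matching.

from dataclasses import dataclass
from typing import List, Optional
import math

@dataclass
class Detection:
    box_3d: Optional[tuple] = None  # (x, y, z, ...) 3D box; may be absent
    box_2d: Optional[tuple] = None  # (x1, y1, x2, y2) image box; may be absent

@dataclass
class Track:
    instance: Detection
    matched: bool = False

def iou_2d(a, b):
    # plain axis-aligned IoU between two (x1, y1, x2, y2) boxes
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def center_dist_3d(a, b):
    # Euclidean distance between 3D box centers
    return math.dist(a[:3], b[:3])

def associate(tracks: List[Track], detections: List[Detection],
              max_dist_3d: float = 2.0, min_iou_2d: float = 0.3):
    matches, unmatched = [], list(detections)
    # Stage 1: greedy 3D matching for tracks/detections that carry 3D boxes.
    for track in tracks:
        if track.instance.box_3d is None:
            continue
        cands = [d for d in unmatched if d.box_3d is not None]
        if not cands:
            continue
        best = min(cands, key=lambda d: center_dist_3d(track.instance.box_3d, d.box_3d))
        if center_dist_3d(track.instance.box_3d, best.box_3d) < max_dist_3d:
            matches.append((track, best))
            track.matched = True
            unmatched.remove(best)
    # Stage 2: 2D IoU fallback for tracks stage 1 missed,
    # e.g. distant objects seen by the camera but not by LiDAR.
    for track in tracks:
        if track.matched or track.instance.box_2d is None:
            continue
        cands = [d for d in unmatched if d.box_2d is not None]
        if not cands:
            continue
        best = max(cands, key=lambda d: iou_2d(track.instance.box_2d, d.box_2d))
        if iou_2d(track.instance.box_2d, best.box_2d) > min_iou_2d:
            matches.append((track, best))
            track.matched = True
            unmatched.remove(best)
    return matches, unmatched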

Visual

Abstract

Multi-object tracking (MOT) enables mobile robots to perform well-informed motion planning and navigation by localizing surrounding objects in 3D space and time. Existing methods rely on depth sensors (e.g., LiDAR) to detect and track targets in 3D space, but only up to a limited sensing range due to the sparsity of the signal. On the other hand, cameras provide a dense and rich visual signal that helps to localize even distant objects, but only in the image domain. In this paper, we propose EagerMOT, a simple tracking formulation that eagerly integrates all available object observations from both sensor modalities to obtain a well-informed interpretation of the scene dynamics. Using images, we can identify distant incoming objects, while depth estimates allow for precise trajectory localization as soon as objects are within the depth-sensing range. With EagerMOT, we achieve state-of-the-art results across several MOT tasks on the KITTI and NuScenes datasets.

Diagram

Benchmark results

Our current standings on KITTI for 2D MOT on the official leaderboard. For 2D MOTS, see this page. Our current standings on NuScenes for 3D MOT on the official leaderboard.

How to set up

Download official NuScenes and KITTI data if you plan on running tracking on them. Change the paths to that data in configs/local_variables.py.

Also set a path to a working directory for each dataset; all files produced by EagerMOT (e.g. fused instances, tracking results) will be saved in that directory. A subfolder will be created for each split: for example, if the working directory is /workspace/kitti, then /workspace/kitti/training and /workspace/kitti/testing will be used for the respective data splits. The split to be run is also specified in local_variables.py. For NuScenes, the version of the dataset (VERSION = "v1.0-trainval") also has to be modified in run_tracking.py when switching between train/test.
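For example, the settings might look like the sketch below. Apart from VERSION = "v1.0-trainval", which the paragraph above names explicitly, the variable names and paths here are illustrative; check the actual files for the exact names.

# configs/local_variables.py -- sketch; exact variable names may differ
KITTI_DATA_DIR = "/data/kitti"             # where the official KITTI data lives
NUSCENES_DATA_DIR = "/data/nuscenes"       # where the official NuScenes data lives

KITTI_WORK_DIR = "/workspace/kitti"        # EagerMOT writes fused instances and
NUSCENES_WORK_DIR = "/workspace/nuscenes"  # tracking results under these folders

SPLIT = "training"  # which data split to run, e.g. "training" or "testing"

# run_tracking.py -- NuScenes dataset version, change when switching train/test
VERSION = "v1.0-trainval"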

If running on KITTI, download ego_motion.zip from the drive and unzip it into the KITTI working directory specified above (either training or testing). NuScenes data is already in world coordinates, so no ego-motion estimates are needed.
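The ego-motion estimates are what allow KITTI detections to be expressed in a fixed world frame across frames (NuScenes already provides this). As a rough illustration of what such a transform does, assuming a 4x4 homogeneous ego pose per frame:

import numpy as np

def to_world(points_ego: np.ndarray, ego_pose: np.ndarray) -> np.ndarray:
    # Map Nx3 points from the ego/sensor frame into the world frame,
    # given a 4x4 homogeneous ego pose (rotation + translation).
    homogeneous = np.hstack([points_ego, np.ones((len(points_ego), 1))])
    return (ego_pose @ homogeneous.T).T[:, :3]

# Example: a point 10 m ahead of an ego vehicle that has moved 5 m forward.
pose = np.eye(4)
pose[0, 3] = 5.0  # ego translation along x
print(to_world(np.array([[10.0, 0.0, 0.0]]), pose))  # -> [[15. 0. 0.]]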

Download 3D and 2D detections; which ones to download depends on what you want to run:

Our benchmark results were achieved with PointGNN + (MOTSFusion+RRC) for KITTI and CenterPoint + MMDetectionCascade for NuScenes.

Unzip detections anywhere you want and provide the path to the root method folder in the inputs/utils.py file.
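For example (the variable name here is hypothetical; see inputs/utils.py for the actual one):

# inputs/utils.py -- sketch; the actual variable name may differ
DETECTIONS_ROOT = "/storage/detections/pointgnn"  # root folder of the unzipped method's detections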

Set up a virtual environment

  • if using conda:
conda create --name <env> --file requirements_conda.txt
  • if using pip:
python3 -m venv env
source env/bin/activate
pip install -r requirements_pip.txt

How to run

See run_tracking.py for the code that launches tracking. Modify which function that file calls, depending on which dataset you want to run. See nearby comments for instructions.

if __name__ == "__main__":
    # choose which one to run, comment out the other one
    run_on_nuscenes()  
    run_on_kitti()

Start the script with $ python run_tracking.py. Check the code itself to see what is being called. I recommend following the function calls to explore how the code is structured.

Overall, the code was written to allow customization and easy experimentation instead of optimizing for performance.

Soon, I am looking to extract the data loading module and push my visualization code into a separate repo to use for other projects.

Please cite our paper if you find the code useful

@inproceedings{Kim21ICRA,
  title     = {EagerMOT: 3D Multi-Object Tracking via Sensor Fusion},
  author    = {Kim, Aleksandr and O\v{s}ep, Aljo\v{s}a and Leal-Taix\'{e}, Laura},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  year      = {2021}
}