Code repository for the paper "Tracking People with 3D Representations"

Related tags

Deep LearningT3DP
Overview

Tracking People with 3D Representations

Code repository for the paper "Tracking People with 3D Representations" (paper link) (project site).
Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Jitendra Malik.
Neural Information Processing Systems (NeurIPS), 2021.

This code repository provides a code implementation for our paper T3DP, with installation, preparing datasets, and evaluating on datasets, and a demo code to run on any youtube videos.

Abstract : We present a novel approach for tracking multiple people in video. Unlike past approaches which employ 2D representations, we focus on using 3D representations of people, located in three-dimensional space. To this end, we develop a method, Human Mesh and Appearance Recovery (HMAR) which in addition to extracting the 3D geometry of the person as a SMPL mesh, also extracts appearance as a texture map on the triangles of the mesh. This serves as a 3D representation for appearance that is robust to viewpoint and pose changes. Given a video clip, we first detect bounding boxes corresponding to people, and for each one, we extract 3D appearance, pose, and location information using HMAR. These embedding vectors are then sent to a transformer, which performs spatio-temporal aggregation of the representations over the duration of the sequence. The similarity of the resulting representations is used to solve for associations that assigns each person to a tracklet. We evaluate our approach on the Posetrack, MuPoTs and AVA datasets. We find that 3D representations are more effective than 2D representations for tracking in these settings, and we obtain state-of-the-art performance.

Installation

We recommend creating a clean conda environment and install all dependencies. You can do this as follows:

conda env create -f _environment.yml

After the installation is complete you can activate the conda environment by running:

conda activate T3DP

Install PyOpenGL from this repository:

pip uninstall pyopengl
git clone https://github.com/mmatl/pyopengl.git
pip install ./pyopengl

Additionally, install Detectron2 from the official repository, if you need to run demo code on a local machine. We provide detections inside the _DATA folder, so for running the tracker on posetrack or mupots, you do not need to install Detectron2.

Download Data

We provide preprocessed files for PoseTrack and MuPoTs datasets (AVA files will be released soon!). Please download this folder and extract inside the main repository.

Training

To train the transformer model with posetrack data run,

python train_t3dp.py
--learning_rate 0.001
--lr_decay_epochs 10000,20000
--epochs 100000
--tags T3PO
--train_dataset posetrack_2018
--test_dataset posetrack_2018
--train_batch_size 32
--feature APK
--train

WANDB will create unique names for each run, and save the model names accordingly. Use this name for evaluation. We have also provided pretrained weights inside the _DATA folder.

Testing

Once the posetrack dataset is downloaded at "_DATA/Posetrack_2018/", run the following command to run our tracker on all validation videos.

python test_t3dp.py
--dataset "posetrack"
--dataset_path "_DATA/Posetrack_2018/"
--storage_folder "Videos_Final"
--render True
--save True

Evaluation

To evaluate the tracking performance on ID switches, MOTA, and IDF1 metrics, please run the following command.

python3 evaluate_t3dp.py out/Videos_Final/results/ t3dp posetrack

Demo

Please run the following command to run our method on a youtube video. This will download the youtube video from a given ID, and extract frames, run Detectron2, run HMAR and finally run our tracker and renders the video.

python3 demo.py

Results (Project site)

We evaluated our method on PoseTrack, MuPoTs and AVA datasets. Our results show significant improvements over the state-of-the-art methods on person tracking. For more results please visit our website.

Acknowledgements

Parts of the code are taken or adapted from the following repos:

Contact

Jathushan Rajasegaran - [email protected] or [email protected]
To ask questions or report issues, please open an issue on the issues tracker.
Discussions, suggestions and questions are welcome!

Citation

If you find this code useful for your research or the use data generated by our method, please consider citing the following paper:

@Inproceedings{rajasegaran2021tracking,
  title     = {Tracking People with 3D Representations},
  author    = {Rajasegaran, Jathushan and Pavlakos, Georgios and Kanazawa, Angjoo and Malik, Jitendra},
  Booktitle = {NeurIPS},
  year      = {2021}
}

Owner
Jathushan Rajasegaran
Jathushan Rajasegaran
[CVPR 2021] Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach

Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach This is the repo to host the dataset TextSeg and code for TexRNe

SHI Lab 174 Dec 19, 2022
Playable Video Generation

Playable Video Generation Playable Video Generation Willi Menapace, Stéphane Lathuilière, Sergey Tulyakov, Aliaksandr Siarohin, Elisa Ricci Paper: ArX

Willi Menapace 136 Dec 31, 2022
Code for ICCV2021 paper PARE: Part Attention Regressor for 3D Human Body Estimation

PARE: Part Attention Regressor for 3D Human Body Estimation [ICCV 2021] PARE: Part Attention Regressor for 3D Human Body Estimation, Muhammed Kocabas,

Muhammed Kocabas 277 Jan 03, 2023
A Python Reconnection Tool for alt:V

altv-reconnect What? It invokes a reconnect in the altV Client Dev Console. You get to determine when your local client should reconnect when developi

8 Jun 30, 2022
You can draw the corresponding bounding box into the image and save it according to the result file (txt format) run by the tracker.

You can draw the corresponding bounding box into the image and save it according to the result file (txt format) run by the tracker.

Huiyiqianli 42 Dec 06, 2022
Code for Temporally Abstract Partial Models

Code for Temporally Abstract Partial Models Accompanies the code for the experimental section of the paper: Temporally Abstract Partial Models, Khetar

DeepMind 19 Jul 13, 2022
Company clustering with K-means/GMM and visualization with PCA, t-SNE, using SSAN relation extraction

RE results graph visualization and company clustering Installation pip install -r requirements.txt python -m nltk.downloader stopwords python3.7 main.

Jieun Han 1 Oct 06, 2022
Exploring Versatile Prior for Human Motion via Motion Frequency Guidance (3DV2021)

Exploring Versatile Prior for Human Motion via Motion Frequency Guidance This is the codebase for video-based human motion reconstruction in human-mot

Jiachen Xu 5 Jul 14, 2022
A trusty face recognition research platform developed by Tencent Youtu Lab

Introduction TFace: A trusty face recognition research platform developed by Tencent Youtu Lab. It provides a high-performance distributed training fr

Tencent 956 Jan 01, 2023
Repo for EchoVPR: Echo State Networks for Visual Place Recognition

EchoVPR Repo for EchoVPR: Echo State Networks for Visual Place Recognition Currently under development Dirs: data: pre-collected hidden representation

Anil Ozdemir 4 Oct 04, 2022
This is the repo for Uncertainty Quantification 360 Toolkit.

UQ360 The Uncertainty Quantification 360 (UQ360) toolkit is an open-source Python package that provides a diverse set of algorithms to quantify uncert

International Business Machines 207 Dec 30, 2022
This is a repository for a No-Code object detection inference API using the OpenVINO. It's supported on both Windows and Linux Operating systems.

OpenVINO Inference API This is a repository for an object detection inference API using the OpenVINO. It's supported on both Windows and Linux Operati

BMW TechOffice MUNICH 68 Nov 24, 2022
FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data

FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data. Flexible EM-Inspired Discriminant Analysis is a robust supervised classification algorithm that performs well i

0 Sep 06, 2022
Code for Domain Adaptive Video Segmentation via Temporal Consistency Regularization in ICCV 2021

Domain Adaptive Video Segmentation via Temporal Consistency Regularization Updates 08/2021: check out our domain adaptation for sematic segmentation p

36 Dec 12, 2022
A curated list of resources for Image and Video Deblurring

A curated list of resources for Image and Video Deblurring

Subeesh Vasu 1.7k Jan 01, 2023
Deep Sea Treasure Environment for Multi-Objective Optimization Research

DeepSeaTreasure Environment Installation In order to get started with this environment, you can install it using the following command: python3 -m pip

imec IDLab 6 Nov 14, 2022
CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images

Code and result about CCAFNet(IEEE TMM) 'CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images' IEE

zyrant丶 14 Dec 29, 2021
FaceQgen: Semi-Supervised Deep Learning for Face Image Quality Assessment

FaceQgen FaceQgen: Semi-Supervised Deep Learning for Face Image Quality Assessment This repository is based on the paper: "FaceQgen: Semi-Supervised D

Javier Hernandez-Ortega 3 Aug 04, 2022
Official implementation of VQ-Diffusion

Official implementation of VQ-Diffusion: Vector Quantized Diffusion Model for Text-to-Image Synthesis

Microsoft 592 Jan 03, 2023
Complete-IoU (CIoU) Loss and Cluster-NMS for Object Detection and Instance Segmentation (YOLACT)

Complete-IoU Loss and Cluster-NMS for Improving Object Detection and Instance Segmentation. Our paper is accepted by IEEE Transactions on Cybernetics

290 Dec 25, 2022