Code repository for the paper "Tracking People with 3D Representations"


Tracking People with 3D Representations

Code repository for the paper "Tracking People with 3D Representations" (paper link) (project site).
Jathushan Rajasegaran, Georgios Pavlakos, Angjoo Kanazawa, Jitendra Malik.
Neural Information Processing Systems (NeurIPS), 2021.

This repository provides an implementation of our paper T3DP, with instructions for installation, dataset preparation, and evaluation, plus demo code that runs on any YouTube video.

Abstract: We present a novel approach for tracking multiple people in video. Unlike past approaches which employ 2D representations, we focus on using 3D representations of people, located in three-dimensional space. To this end, we develop a method, Human Mesh and Appearance Recovery (HMAR) which in addition to extracting the 3D geometry of the person as a SMPL mesh, also extracts appearance as a texture map on the triangles of the mesh. This serves as a 3D representation for appearance that is robust to viewpoint and pose changes. Given a video clip, we first detect bounding boxes corresponding to people, and for each one, we extract 3D appearance, pose, and location information using HMAR. These embedding vectors are then sent to a transformer, which performs spatio-temporal aggregation of the representations over the duration of the sequence. The similarity of the resulting representations is used to solve for associations that assign each person to a tracklet. We evaluate our approach on the PoseTrack, MuPoTs and AVA datasets. We find that 3D representations are more effective than 2D representations for tracking in these settings, and we obtain state-of-the-art performance.
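
As a rough illustration of the association step described in the abstract, the sketch below matches detections to tracklets by maximizing the total cosine similarity between embedding vectors. Everything in it is an assumption made for illustration: the function names, the toy vectors, and the brute-force search over permutations, which stands in for an efficient assignment solver such as the Hungarian algorithm and is only practical for a handful of people. It is not the repository's implementation.

```python
from itertools import permutations
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def associate(tracklets, detections):
    """Return {detection_index: tracklet_index} maximizing total similarity.

    Hypothetical helper: assumes len(detections) <= len(tracklets) and
    tries every one-to-one assignment, so it only suits tiny inputs.
    """
    best_score, best_assignment = float("-inf"), {}
    for perm in permutations(range(len(tracklets)), len(detections)):
        score = sum(
            cosine_similarity(detections[d], tracklets[t])
            for d, t in enumerate(perm)
        )
        if score > best_score:
            best_score = score
            best_assignment = dict(enumerate(perm))
    return best_assignment


# Toy embeddings (made up): two existing tracklets, two new detections.
tracklets = [[1.0, 0.0, 0.2], [0.0, 1.0, 0.1]]
detections = [[0.1, 0.9, 0.0], [0.9, 0.1, 0.3]]
matches = associate(tracklets, detections)
print(matches)
```

Each detection is assigned to the tracklet whose embedding it resembles most, subject to the constraint that no tracklet receives two detections in the same frame.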

Installation

We recommend creating a clean conda environment and installing all dependencies in it. You can do this as follows:

conda env create -f _environment.yml

After the installation is complete you can activate the conda environment by running:

conda activate T3DP

Install PyOpenGL from this repository:

pip uninstall pyopengl
git clone https://github.com/mmatl/pyopengl.git
pip install ./pyopengl

Additionally, install Detectron2 from the official repository if you need to run the demo code on a local machine. We provide detections inside the _DATA folder, so you do not need Detectron2 to run the tracker on PoseTrack or MuPoTs.

Download Data

We provide preprocessed files for the PoseTrack and MuPoTs datasets (AVA files will be released soon!). Please download this folder and extract it inside the main repository.

Training

To train the transformer model on PoseTrack data, run:

python train_t3dp.py \
    --learning_rate 0.001 \
    --lr_decay_epochs 10000,20000 \
    --epochs 100000 \
    --tags T3PO \
    --train_dataset posetrack_2018 \
    --test_dataset posetrack_2018 \
    --train_batch_size 32 \
    --feature APK \
    --train

WANDB will create a unique name for each run, and the model is saved under that name. Use this name for evaluation. We have also provided pretrained weights inside the _DATA folder.
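
The --lr_decay_epochs value in the training command is a comma-separated list of epochs. One common way such a list is used is a step schedule that scales the learning rate down each time a milestone is passed. The sketch below shows that interpretation only; the decay factor of 0.1 and both function names are assumptions, and train_t3dp.py may handle the flag differently.

```python
def parse_milestones(text):
    """Turn a string like "10000,20000" into a sorted list of ints."""
    return sorted(int(part) for part in text.split(",") if part.strip())


def step_lr(base_lr, epoch, milestones, gamma=0.1):
    """Learning rate after multiplying by gamma once per passed milestone.

    gamma=0.1 is an assumed decay factor, not a value from the repository.
    """
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed


milestones = parse_milestones("10000,20000")
for epoch in (0, 10000, 25000):
    print(epoch, step_lr(0.001, epoch, milestones))
```

Under these assumptions the rate stays at its base value until the first milestone and then drops by a factor of ten at each milestone.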

Testing

Once the PoseTrack dataset has been downloaded to "_DATA/Posetrack_2018/", use the following command to run our tracker on all validation videos.

python test_t3dp.py \
    --dataset "posetrack" \
    --dataset_path "_DATA/Posetrack_2018/" \
    --storage_folder "Videos_Final" \
    --render True \
    --save True

Evaluation

To evaluate tracking performance in terms of ID switches, MOTA, and IDF1, please run the following command.

python3 evaluate_t3dp.py out/Videos_Final/results/ t3dp posetrack
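
For reference, MOTA and IDF1 are commonly defined from error counts as in the sketch below. The function names and the example counts are made up for illustration; evaluate_t3dp.py computes its metrics from the tracker's result files, and its exact procedure may differ.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multiple Object Tracking Accuracy: 1 - (FN + FP + IDSW) / GT."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp, idfp, idfn):
    """Identity F1 score: 2*IDTP / (2*IDTP + IDFP + IDFN)."""
    denominator = 2 * idtp + idfp + idfn
    return 2 * idtp / denominator if denominator else 0.0


# Made-up counts for a short clip with 1000 ground-truth boxes.
print(f"MOTA: {mota(100, 50, 10, 1000):.3f}")
print(f"IDF1: {idf1(800, 150, 200):.3f}")
```

ID switches enter MOTA directly as one of the three error terms, while IDF1 measures how consistently each ground-truth identity is matched to a single predicted track over the whole sequence.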

Demo

Please run the following command to run our method on a YouTube video. It downloads the video for a given ID, extracts frames, runs Detectron2 and HMAR, and finally runs our tracker and renders the output video.

python3 demo.py

Results (Project site)

We evaluated our method on the PoseTrack, MuPoTs and AVA datasets. Our results show significant improvements over state-of-the-art methods for person tracking. For more results please visit our website.

Acknowledgements

Parts of the code are taken or adapted from the following repos:

Contact

Jathushan Rajasegaran - [email protected] or [email protected]
To ask questions or report issues, please open an issue on the issues tracker.
Discussions, suggestions and questions are welcome!

Citation

If you find this code useful for your research, or if you use data generated by our method, please consider citing the following paper:

@Inproceedings{rajasegaran2021tracking,
  title     = {Tracking People with 3D Representations},
  author    = {Rajasegaran, Jathushan and Pavlakos, Georgios and Kanazawa, Angjoo and Malik, Jitendra},
  Booktitle = {NeurIPS},
  year      = {2021}
}
