Towards Multi-Camera 3D Human Pose Estimation in Wild Environment

Overview

PanopticStudio Toolbox

This repository has a toolbox to download, process, and visualize the Panoptic Studio (Panoptic) data.

Note:

  • Sep-21-2020: Our server is currently offline due to a power outage on the CMU campus, and COVID-19 makes it difficult to access the server room. We will fix the issue as soon as possible.
  • Sep-30-2020: Unfortunately, we found that our server is broken, and we are replacing it now. Please wait a couple more weeks.
  • Oct-5-2020: Our server is back online now!
  • May-18-2021: Our server is currently offline for maintenance. Hopefully it will be back online within this week.

Quick start guide

Follow these steps to set up a simple example:

1. Check out the codebase

git clone https://github.com/CMU-Perceptual-Computing-Lab/panoptic-toolbox
cd panoptic-toolbox

2. Download the sample data and other data

To download a dataset, named "171204_pose1_sample" in this example, run the following script.

./scripts/getData.sh 171204_pose1_sample

This bash script requires curl or wget.

This script will create a folder "./171204_pose1_sample" and download the following files.

  • 171204_pose1_sample/hdVideos/hd_00_XX.mp4 #synchronized HD video files (31 views)
  • 171204_pose1_sample/vgaVideos/KINECTNODE%d/vga_XX_XX.mp4 #synchronized VGA video files (480 views)
  • 171204_pose1_sample/calibration_171204_pose1_sample.json #calibration files
  • 171204_pose1_sample/hdPose3d_stage1_coco19.tar #3D Body Keypoint Data (coco19 keypoint definition)
  • 171204_pose1_sample/hdFace3d.tar #3D Face Keypoint Data
  • 171204_pose1_sample/hdHand3d.tar #3D Hand Keypoint Data

Note that this sample example currently does not have VGA videos.

You can also download any other sequence through this script. Just use the name of the target sequence instead of "171204_pose1_sample". For example,

./scripts/getData.sh 171204_pose1

for the full version of the 171204_pose1 sequence. You can also specify the number of videos you want to download.

./scripts/getData.sh (sequenceName) (VGA_Video_Number) (HD_Video_Number)

For example, the following command will download 240 VGA videos and 10 HD videos.

./scripts/getData.sh 171204_pose1_sample 240 10

Note that we have sorted the VGA camera order so that the views you download are uniformly distributed.
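If you drive the download from Python, the positional argument order above can be captured in a small helper. This is only a sketch: the function name build_get_data_command and its parameter names are made up for illustration and are not part of the toolbox. It only builds the argument list and does not contact the server.

```python
from typing import List, Optional


def build_get_data_command(sequence_name: str,
                           vga_video_num: Optional[int] = None,
                           hd_video_num: Optional[int] = None) -> List[str]:
    """Build the argument list for ./scripts/getData.sh (hypothetical helper)."""
    # The HD count is the third positional argument, so it cannot be
    # given without a VGA count in front of it.
    if hd_video_num is not None and vga_video_num is None:
        raise ValueError("a VGA video count is required when an HD count is given")

    command = ["./scripts/getData.sh", sequence_name]
    for count in (vga_video_num, hd_video_num):
        if count is None:
            continue
        if count < 0:
            raise ValueError("video counts must be non-negative")
        command.append(str(count))
    return command


if __name__ == "__main__":
    # Same request as the example above: 240 VGA videos and 10 HD videos.
    print(" ".join(build_get_data_command("171204_pose1_sample", 240, 10)))
```

The returned list can be passed to subprocess.run(command, check=True) from the root of the checked-out repository.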

3. Downloading All Available Sequences

You can find the list of currently available sequences in the following link:

List of released sequences (ver1.2)

Downloading all of them (including videos) may take a long time, but downloading 3D keypoint files (body+face+hand upon their availability) should be "relatively" quick.

You can use the following script to download currently available sequences (ver 1.2):

./scripts/getDB_panopticHD_ver1_2.sh

By default, the script does not download any videos. Feel free to change "vgaVideoNum" and "hdVideoNum" in the script to other numbers if you also want to download videos.

You can see example videos and other information for each sequence on our website: Browsing dataset.

Check the 3D viewer on each sequence's page, where you can visualize 3D skeletons in your web browser. For example: http://domedb.perception.cs.cmu.edu/panopticHDpose1.html

4. Extract the images & 3D keypoint data

This step requires ffmpeg.

./scripts/extractAll.sh 171204_pose1_sample

This will extract images, for example 171204_pose1_sample/hdImgs/00_00/00_00_00000000.jpg, and the corresponding 3D skeleton data, for example 171204_pose1_sample/hdPose3d_stage1_coco19/body3DScene_00000000.json.
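The two example paths above suggest a naming pattern that is easy to reproduce when pairing an image with its skeleton file. The sketch below generalizes from those single examples, so the zero-padding widths and the reading of "00_00" as panel and camera index are assumptions; the helper names are hypothetical.

```python
from pathlib import PurePosixPath


def hd_image_path(sequence_dir: str, panel: int, camera: int, frame: int) -> PurePosixPath:
    """Path of one extracted HD image (pattern inferred from the example)."""
    view = f"{panel:02d}_{camera:02d}"  # assumed: two-digit panel and camera index
    return PurePosixPath(sequence_dir) / "hdImgs" / view / f"{view}_{frame:08d}.jpg"


def hd_skeleton_path(sequence_dir: str, frame: int) -> PurePosixPath:
    """Path of the 3D skeleton file for the same HD frame index."""
    return (PurePosixPath(sequence_dir) / "hdPose3d_stage1_coco19"
            / f"body3DScene_{frame:08d}.json")


if __name__ == "__main__":
    print(hd_image_path("171204_pose1_sample", 0, 0, 0))
    print(hd_skeleton_path("171204_pose1_sample", 0))
```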

extractAll.sh is a simple script that combines the following set of commands (you shouldn't need to run these again):

cd 171204_pose1_sample
../scripts/vgaImgsExtractor.sh # PNG files from VGA video (25 fps)
../scripts/hdImgsExtractor.sh # PNG files from HD video (29.97 fps)
tar -xf vgaPose3d_stage1.tar # Extract skeletons at VGA framerate
tar -xf hdPose3d_stage1.tar # Extract skeletons for HD
cd ..

5. Run demo programs

Python

This code requires numpy and matplotlib.

Visualizing 3D keypoints (body, face, hand):

cd python
jupyter notebook demo_3Dkeypoints_3dview.ipynb

The result should look like this.

Reprojecting 3D keypoints (body, face, hand) on a selected HD view:

cd python
jupyter notebook demo_3Dkeypoints_reprojection_hd.ipynb

The result should look like this.
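For readers who want the idea behind reprojection without opening the notebook, the following is a minimal pinhole-camera sketch. It is not the notebook's implementation: it ignores lens distortion, and the intrinsic matrix K, rotation R, and translation t are hypothetical inputs that you would have to fill in from the calibration file yourself.

```python
from typing import Sequence, Tuple

Vec3 = Sequence[float]
Mat3 = Sequence[Sequence[float]]


def project_point(point: Vec3, K: Mat3, R: Mat3, t: Vec3) -> Tuple[float, float]:
    """Project one 3D world point to pixel coordinates (no distortion)."""
    # World coordinates -> camera coordinates: X_cam = R * X + t
    cam = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if cam[2] <= 0:
        raise ValueError("point is behind the camera")
    # Camera coordinates -> homogeneous pixel coordinates: x = K * X_cam
    pix = [sum(K[i][j] * cam[j] for j in range(3)) for i in range(3)]
    # Perspective division
    return pix[0] / pix[2], pix[1] / pix[2]


if __name__ == "__main__":
    # Made-up camera: focal length 1000 px, principal point (960, 540),
    # camera placed at the world origin looking down +z.
    K = [[1000, 0, 960], [0, 1000, 540], [0, 0, 1]]
    R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    t = [0, 0, 0]
    print(project_point([10, -5, 100], K, R, t))
```

A point on the optical axis lands on the principal point, which is a quick sanity check when wiring in real calibration values.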

Python + OpenGL

  • This code requires pyopengl.

  • Visualizing 3D keypoints (body, face, hand):

python glViewer.py

Matlab

Note: Matlab code is outdated, and does not handle 3D keypoint outputs (coco19 body, face, hand). Please see this code only for reference. We will update this later.

Matlab example (outdated):

>>> cd matlab
>>> demo

Skeleton Output Format

We reconstruct the 3D skeletons of people using the method of Joo et al. 2018.

The output of each frame is written to a JSON file. For example,

{ "version": 0.7, 
"univTime" :53541.542,
"fpsType" :"hd_29_97",
"bodies" :
[
{ "id": 0,
"joints19": [-19.4528, -146.612, 1.46159, 0.724274, -40.4564, -163.091, -0.521563, 0.575897, -14.9749, -91.0176, 4.24329, 0.361725, -19.2473, -146.679, -16.1136, 0.643555, -14.7958, -118.804, -20.6738, 0.619599, -22.611, -93.8793, -17.7834, 0.557953, -12.3267, -91.5465, -6.55368, 0.353241, -12.6556, -47.0963, -4.83599, 0.455566, -10.8069, -8.31645, -4.20936, 0.501312, -20.2358, -147.348, 19.1843, 0.628022, -13.1145, -120.269, 28.0371, 0.63559, -20.1037, -94.3607, 30.0809, 0.625916, -17.623, -90.4888, 15.0403, 0.327759, -17.3973, -46.9311, 15.9659, 0.419586, -13.1719, -7.60601, 13.4749, 0.519653, -38.7164, -166.851, -3.25917, 0.46228, -28.7043, -167.333, -7.15903, 0.523224, -39.0433, -166.677, 2.55916, 0.395965, -30.0718, -167.264, 8.18371, 0.510041]
}
] }

Here, each subject has the following values.

  • id: a unique subject index within a sequence. Skeletons with the same id across time represent temporally associated moving skeletons (an individual). However, the same person may have multiple ids.
  • joints19: 19 3D joint locations, formatted as [x1,y1,z1,c1,x2,y2,z2,c2,...], where each c is a joint confidence score.
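The flat joints19 layout can be unpacked into one (x, y, z, c) tuple per joint with a few lines of standard-library Python. This is a sketch, not the toolbox's own loader: the function name load_bodies is hypothetical, and the demo frame below uses placeholder numbers rather than real measurements.

```python
import json
from typing import Dict, List, Tuple

Joint = Tuple[float, float, float, float]  # (x, y, z, confidence)


def load_bodies(frame_json: str) -> Dict[int, List[Joint]]:
    """Map each body id in one frame to its list of 19 (x, y, z, c) joints."""
    frame = json.loads(frame_json)
    bodies: Dict[int, List[Joint]] = {}
    for body in frame["bodies"]:
        flat = body["joints19"]
        if len(flat) != 19 * 4:
            raise ValueError(f"expected 76 values, got {len(flat)}")
        # Consecutive groups of four values belong to one joint.
        bodies[body["id"]] = [tuple(flat[i:i + 4]) for i in range(0, len(flat), 4)]
    return bodies


if __name__ == "__main__":
    # Placeholder frame: values 0..75 stand in for real coordinates.
    demo = json.dumps({
        "version": 0.7,
        "bodies": [{"id": 0, "joints19": [float(v) for v in range(76)]}],
    })
    for body_id, joints in load_bodies(demo).items():
        print(body_id, len(joints), joints[0])
```

For a real frame, read the file contents first, for example with open(path).read(), and pass the resulting string to load_bodies.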

The 3D skeletons have the following keypoint order:

0: Neck
1: Nose
2: BodyCenter (center of hips)
3: lShoulder
4: lElbow
5: lWrist
6: lHip
7: lKnee
8: lAnkle
9: rShoulder
10: rElbow
11: rWrist
12: rHip
13: rKnee
14: rAnkle
15: lEye
16: lEar
17: rEye
18: rEar
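When indexing into a skeleton, it can help to keep the order above in one place instead of hard-coding numbers. The sketch below simply transcribes the list; the names JOINT_NAMES, joint_index, and name_joints are illustrative and do not come from the toolbox.

```python
from typing import Dict, Sequence, Tuple

# Keypoint order transcribed from the list above (index == position).
JOINT_NAMES = (
    "Neck", "Nose", "BodyCenter",
    "lShoulder", "lElbow", "lWrist",
    "lHip", "lKnee", "lAnkle",
    "rShoulder", "rElbow", "rWrist",
    "rHip", "rKnee", "rAnkle",
    "lEye", "lEar", "rEye", "rEar",
)


def joint_index(name: str) -> int:
    """Return the keypoint index for a joint name, e.g. 'rWrist'."""
    return JOINT_NAMES.index(name)


def name_joints(joints19: Sequence[float]) -> Dict[str, Tuple[float, ...]]:
    """Label a flat [x1,y1,z1,c1,...] list with the joint names."""
    if len(joints19) != 4 * len(JOINT_NAMES):
        raise ValueError("joints19 must hold 19 groups of (x, y, z, c)")
    return {
        name: tuple(joints19[4 * i:4 * i + 4])
        for i, name in enumerate(JOINT_NAMES)
    }


if __name__ == "__main__":
    print(joint_index("rWrist"))
    # Placeholder values 0..75 instead of real coordinates.
    print(name_joints(list(range(76)))["Nose"])
```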

Note that this differs from the OpenPose output order, although our method is based on it.

Note that we used to use an old format (named mpi15, as described in our outdated document), but we no longer use it.

KinopticStudio Toolbox

Kinoptic Studio is a subsystem of Panoptic Studio, which is composed of 10 Kinect2 sensors. Please see: README_kinoptic

Panoptic 3D PointCloud DB ver.1

You can download all sequences included in our 3D PointCloud DB ver.1 using the following script:

./scripts/getDB_ptCloud_ver1.sh

Haggling DB

We have released the processed data for the haggling sequence. Please see Social Signal Processing repository.

License

The Panoptic Studio Dataset is freely available for non-commercial and research purposes only.

References

By using the dataset, you agree to cite at least one of the following papers.

@inproceedings{Joo_2015_ICCV,
author = {Joo, Hanbyul and Liu, Hao and Tan, Lei and Gui, Lin and Nabbe, Bart and Matthews, Iain and Kanade, Takeo and Nobuhara, Shohei and Sheikh, Yaser},
title = {Panoptic Studio: A Massively Multiview System for Social Motion Capture},
booktitle = {ICCV},
year = {2015} }

@article{Joo_2017_TPAMI,
title={Panoptic Studio: A Massively Multiview System for Social Interaction Capture},
author={Joo, Hanbyul and Simon, Tomas and Li, Xulong and Liu, Hao and Tan, Lei and Gui, Lin and Banerjee, Sean and Godisart, Timothy Scott and Nabbe, Bart and Matthews, Iain and Kanade, Takeo and Nobuhara, Shohei and Sheikh, Yaser},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2017} }

@inproceedings{Simon_2017_CVPR,
title={Hand Keypoint Detection in Single Images using Multiview Bootstrapping},
author={Simon, Tomas and Joo, Hanbyul and Sheikh, Yaser},
booktitle={CVPR},
year={2017} }

@inproceedings{joo2019ssp,
  title={Towards Social Artificial Intelligence: Nonverbal Social Signal Prediction in A Triadic Interaction},
  author={Joo, Hanbyul and Simon, Tomas and Cikara, Mina and Sheikh, Yaser},
  booktitle={CVPR},
  year={2019}
}


