Towards Multi-Camera 3D Human Pose Estimation in Wild Environment

Overview

PanopticStudio Toolbox

This repository has a toolbox to download, process, and visualize the Panoptic Studio (Panoptic) data.

Note:

  • Sep-21-2020: Currently our server is offline due to the power outage in the CMU Campus, and COVID-19 makes it difficult to access the server room. We will fix the issue as soon as possible. 
  • Sep-30-2020: Unfortunately, we found that our server has been broken and we are replacing it now. Please wait a couple of more weeks.
  • Oct-5-2020: Our server is back and online now!
  • May-18-2021: Currently our server is offline due to our server maintenance. Hopefully it will be back online in this week.  

Quick start guide

Follow these steps to set up a simple example:

1. Check out the codebase

git clone https://github.com/CMU-Perceptual-Computing-Lab/panoptic-toolbox
cd panoptic-toolbox

2. Download a sample data and other data

To download a dataset, named "171204_pose1_sample" in this example, run the following script.

./scripts/getData.sh 171204_pose1_sample

This bash script requires curl or wget.

This script will create a folder "./171204_pose1_sample" and download the following files.

  • 171204_pose1_sample/hdVideos/hd_00_XX.mp4 #synchronized HD video files (31 views)
  • 171204_pose1_sample/vgaVideos/KINECTNODE%d/vga_XX_XX.mp4 #synchrponized VGA video files (480 views)
  • 171204_pose1_sample/calibration_171204_pose1_sample.json #calibration files
  • 171204_pose1_sample/hdPose3d_stage1_coco19.tar #3D Body Keypoint Data (coco19 keypoint definition)
  • 171204_pose1_sample/hdFace3d.tar #3D Face Keypoint Data
  • 171204_pose1_sample/hdHand3d.tar #3D Hand Keypoint Data

Note that this sample example currently does not have VGA videos.

You can also download any other seqeunce through this script. Just use the the name of the target sequence: instead of the "171204_pose1panopticHD". r example,

./scripts/getData.sh 171204_pose1

for the full version of 171204_pose1 sequence:. You can also specify the number of videospanopticHDnt to donwload.

./scripts/getData.sh (sequenceName) (VGA_Video_Number) (HD_Video_Number)

For example, the following command will download 240 vga videos and 10 videos.

./scripts/getData.sh 171204_pose1_sample 240 10

Note that we have sorted the VGA camera order so that you download uniformly distributed view.

3. Downloading All Available Sequences

You can find the list of currently available sequences in the following link:

List of released sequences (ver1.2)

Downloading all of them (including videos) may take a long time, but downloading 3D keypoint files (body+face+hand upon their availability) should be "relatively" quick.

You can use the following script to download currently available sequences (ver 1.2):

./scripts/getDB_panopticHD_ver1_2.sh

The default setting is not downloading any videos. Feel free to change the "vgaVideoNum" and "hdVideoNum" in the script to other numbers if you also want to download videos.

You can see the example videos and other information of each sequence: in our website: Browsing dataset.

Check the 3D viewer in each sequence: page where you can visualize 3D skeletons in your web browser. For example: http://domedb.perception.cs.cmu.edu/panopticHDpose1.html

4. Extract the images & 3D keypoint data

This step requires ffmpeg.

./scripts/extractAll.sh 171204_pose1_sample

This will extract images, for example 171204_pose1_sample/hdImgs/00_00/00_00_00000000.jpg, and the corresponding 3D skeleton data, for example 171204_pose1_sample/hdPose3d_stage1_coco19/body3DScene_00000000.json.

extractAll.sh is a simple script that combines the following set of commands (you shouldn't need to run these again):

cd 171204_pose1_sample
../scripts/vgaImgsExtractor.sh # PNG files from VGA video (25 fps)
../scripts/hdImgsExtractor.sh # PNG files from HD video (29.97 fps)
tar -xf vgaPose3d_stage1.tar # Extract skeletons at VGA framerate
tar -xf hdPose3d_stage1.tar # Extract skeletons for HD
cd ..

5. Run demo programs

Python

This codes require numpy, matplotlib.

Visualizing 3D keypoints (body, face, hand):

cd python
jupyter notebook demo_3Dkeypoints_3dview.ipynb

The result should look like this.

Reprojecting 3D keypoints (body, face, hand) on a selected HD view:

cd python
jupyter notebook demo_3Dkeypoints_reprojection_hd.ipynb

The result should look like this.

This codes require numpy, matplotlib.

Visualizing 3D keypoints (body, face, hand):

cd python
jupyter notebook demo_3Dkeypoints_3dview.ipynb

The result should look like this.

Reprojecting 3D keypoints (body, face, hand) on a selected HD view:

cd python
jupyter notebook demo_3Dkeypoints_reprojection_hd.ipynb

The result should look like this.

Python + OpengGL

  • This codes require pyopengl.

  • Visualizing 3D keypoints (body, face, hand):

python glViewer.py

Matlab

Note: Matlab code is outdated, and does not handle 3D keypoint outputs (coco19 body, face, hand). Please see this code only for reference. We will update this later.

Matlab example (outdated):

>>> cd matlab
>>> demo

Skeleton Output Format

We reconstruct 3D skeleton of people using the method of Joo et al. 2018.

The output of each frame is written in a json file. For example,

{ "version": 0.7, 
"univTime" :53541.542,
"fpsType" :"hd_29_97",
"bodies" :
[
{ "id": 0,
"joints19": [-19.4528, -146.612, 1.46159, 0.724274, -40.4564, -163.091, -0.521563, 0.575897, -14.9749, -91.0176, 4.24329, 0.361725, -19.2473, -146.679, -16.1136, 0.643555, -14.7958, -118.804, -20.6738, 0.619599, -22.611, -93.8793, -17.7834, 0.557953, -12.3267, -91.5465, -6.55368, 0.353241, -12.6556, -47.0963, -4.83599, 0.455566, -10.8069, -8.31645, -4.20936, 0.501312, -20.2358, -147.348, 19.1843, 0.628022, -13.1145, -120.269, 28.0371, 0.63559, -20.1037, -94.3607, 30.0809, 0.625916, -17.623, -90.4888, 15.0403, 0.327759, -17.3973, -46.9311, 15.9659, 0.419586, -13.1719, -7.60601, 13.4749, 0.519653, -38.7164, -166.851, -3.25917, 0.46228, -28.7043, -167.333, -7.15903, 0.523224, -39.0433, -166.677, 2.55916, 0.395965, -30.0718, -167.264, 8.18371, 0.510041]
}
] }

Here, each subject has the following values.

id: a unique subject index within a sequence:. Skeletons with the same id across time represent temporally associated moving skeletons (an individual). However, the same person may have multiple ids joints19: 19 3D joint locations, formatted as [x1,y1,z1,c1,x2,y2,z2,c2,...] where each c ispanopticHDjoint confidence score.

The 3D skeletons have the following keypoint order:

0: Neck
1: Nose
2: BodyCenter (center of hips)
3: lShoulder
4: lElbow
5: lWrist,
6: lHip
7: lKnee
8: lAnkle
9: rShoulder
10: rElbow
11: rWrist
12: rHip
13: rKnee
14: rAnkle
15: lEye
16: lEar
17: rEye
18: rEar

Note that this is different from OpenPose output order, although our method is based on it.

Note that we used to use an old format (named mpi15 as described in our outdated document), but we do not this format anymore.

KinopticStudio Toolbox

Kinoptic Studio is a subsystem of Panoptic Studio, which is composed of 10 Kinect2 sensors. Please see: README_kinoptic

Panoptic 3D PointCloud DB ver.1

You can download all sequences included in our 3D PointCloud DB ver.1 using the following script:

./scripts/getDB_ptCloud_ver1.sh

Haggling DB

We have released the processed data for the haggling sequence. Please see Social Signal Processing repository.

Teaser Image

License

Panoptic Studio Dataset is freely available for non-commercial and research purpose only.

References

By using the dataset, you agree to cite at least one of the following papers.

@inproceedings{Joo_2015_ICCV,
author = {Joo, Hanbyul and Liu, Hao and Tan, Lei and Gui, Lin and Nabbe, Bart and Matthews, Iain and Kanade, Takeo and Nobuhara, Shohei and Sheikh, Yaser},
title = {Panoptic Studio: A Massively Multiview System for Social Motion Capture},
booktitle = {ICCV},
year = {2015} }

@inproceedings{Joo_2017_TPAMI,
title={Panoptic Studio: A Massively Multiview System for Social Interaction Capture},
author={Joo, Hanbyul and Simon, Tomas and Li, Xulong and Liu, Hao and Tan, Lei and Gui, Lin and Banerjee, Sean and Godisart, Timothy Scott and Nabbe, Bart and Matthews, Iain and Kanade, Takeo and Nobuhara, Shohei and Sheikh, Yaser},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
year={2017} }

@inproceedings{Simon_2017_CVPR,
title={Hand Keypoint Detection in Single Images using Multiview Bootstrapping},
author={Simon, Tomas and Joo, Hanbyul and Sheikh, Yaser},
journal={CVPR},
year={2017} }

@inproceedings{joo2019ssp,
  title={Towards Social Artificial Intelligence: Nonverbal Social Signal Prediction in A Triadic Interaction},
  author={Joo, Hanbyul and Simon, Tomas and Cikara, Mina and Sheikh, Yaser},
  booktitle={CVPR},
  year={2019}
}



'Aligned mixture of latent dynamical systems' (amLDS) for stimulus decoding probabilistic manifold alignment across animals. P. Herrero-Vidal et al. NeurIPS 2021 code.

Across-animal odor decoding by probabilistic manifold alignment (NeurIPS 2021) This repository is the official implementation of aligned mixture of la

Pedro Herrero-Vidal 3 Jul 12, 2022
Scripts used to make and evaluate OpenAlex's concept tagging model

openalex-concept-tagging This repository contains all of the code for getting the concept tagger up and running. To learn more about where this model

OurResearch 18 Dec 09, 2022
Implementation of "Debiasing Item-to-Item Recommendations With Small Annotated Datasets" (RecSys '20)

Debiasing Item-to-Item Recommendations With Small Annotated Datasets This is the code for our RecSys '20 paper. Other materials can be found here: Ful

Microsoft 34 Aug 10, 2022
Official Matlab Implementation for "Tiny Obstacle Discovery by Occlusion-aware Multilayer Regression", TIP 2020

Tiny Obstacle Discovery by Occlusion-aware Multilayer Regression Official Matlab Implementation for "Tiny Obstacle Discovery by Occlusion-aware Multil

Xuefeng 5 Jan 15, 2022
A PyTorch-based Semi-Supervised Learning (SSL) Codebase for Pixel-wise (Pixel) Vision Tasks

PixelSSL is a PyTorch-based semi-supervised learning (SSL) codebase for pixel-wise (Pixel) vision tasks. The purpose of this project is to promote the

Zhanghan Ke 255 Dec 11, 2022
PyTorch implementation of InstaGAN: Instance-aware Image-to-Image Translation

InstaGAN: Instance-aware Image-to-Image Translation Warning: This repo contains a model which has potential ethical concerns. Remark that the task of

Sangwoo Mo 827 Dec 29, 2022
Multiple paper open-source codes of the Microsoft Research Asia DKI group

📫 Paper Code Collection (MSRA DKI Group) This repo hosts multiple open-source codes of the Microsoft Research Asia DKI Group. You could find the corr

Microsoft 249 Jan 08, 2023
Source code for the paper "PLOME: Pre-training with Misspelled Knowledge for Chinese Spelling Correction" in ACL2021

PLOME:Pre-training with Misspelled Knowledge for Chinese Spelling Correction (ACL2021) This repository provides the code and data of the work in ACL20

197 Nov 26, 2022
Keras Image Embeddings using Contrastive Loss

Image to Embedding projection in vector space. Implementation in keras and tensorflow of batch all triplet loss for one-shot/few-shot learning.

Shravan Anand K 5 Mar 21, 2022
The repo for the paper "I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection".

I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection Updates | Introduction | Results | Usage | Citation |

33 Jan 05, 2023
【CVPR 2021, Variational Inference Framework, PyTorch】 From Rain Generation to Rain Removal

From Rain Generation to Rain Removal (CVPR2021) Hong Wang, Zongsheng Yue, Qi Xie, Qian Zhao, Yefeng Zheng, and Deyu Meng [PDF&&Supplementary Material]

Hong Wang 48 Nov 23, 2022
Deep Learning Visuals contains 215 unique images divided in 23 categories

Deep Learning Visuals contains 215 unique images divided in 23 categories (some images may appear in more than one category). All the images were originally published in my book "Deep Learning with P

Daniel Voigt Godoy 1.3k Dec 28, 2022
PINN(s): Physics-Informed Neural Network(s) for von Karman vortex street

PINN(s): Physics-Informed Neural Network(s) for von Karman vortex street This is

ShotaDEGUCHI 2 Apr 18, 2022
Simulation of moving particles under microscopic imaging

Simulation of moving particles under microscopic imaging Install scipy numpy scikit-image tiffile Run python simulation.py Read result https://imagej

Zehao Wang 2 Dec 14, 2021
scAR (single-cell Ambient Remover) is a package for data denoising in single-cell omics.

scAR scAR (single cell Ambient Remover) is a package for denoising multiple single cell omics data. It can be used for multiple tasks, such as, sgRNA

19 Nov 28, 2022
Optimized code based on M2 for faster image captioning training

Transformer Captioning This repository contains the code for Transformer-based image captioning. Based on meshed-memory-transformer, we further optimi

lyricpoem 16 Dec 16, 2022
This is my research project for the Irving Center for Cancer Dynamics/Azizi Lab, Columbia University.

bayesian_uncertainty This is my research project for the Irving Center for Cancer Dynamics/Azizi Lab, Columbia University. In this project I build a s

Max David Gupta 1 Feb 13, 2022
Azion the best solution of Edge Computing in the world.

Azion Edge Function docker action Create or update an Edge Functions on Azion Edge Nodes. The domain name is the key for decision to a create or updat

8 Jul 16, 2022
Official implementation of the paper Momentum Capsule Networks (MoCapsNet)

Momentum Capsule Network Official implementation of the paper Momentum Capsule Networks (MoCapsNet). Abstract Capsule networks are a class of neural n

8 Oct 20, 2022
Over9000 optimizer

Optimizers and tests Every result is avg of 20 runs. Dataset LR Schedule Imagenette size 128, 5 epoch Imagewoof size 128, 5 epoch Adam - baseline OneC

Mikhail Grankin 405 Nov 27, 2022