A simple baseline for 3d human pose estimation in tensorflow. Presented at ICCV 17.

Last update: Jan 03, 2023

Overview

3d-pose-baseline

This is the code for the paper

Julieta Martinez, Rayat Hossain, Javier Romero, James J. Little. A simple yet effective baseline for 3d human pose estimation. In ICCV, 2017. https://arxiv.org/pdf/1705.03098.pdf.

The code in this repository was mostly written by Julieta Martinez, Rayat Hossain and Javier Romero.

We provide a strong baseline for 3d human pose estimation that also sheds light on the challenges of current approaches. Our model is lightweight and we strive to make our code transparent, compact, and easy-to-understand.

Dependencies

Python ≥ 3.5
cdflib
tensorflow 1.0 or later

First of all

Watch our video: https://youtu.be/Hmi3Pd9x1BE
Clone this repository

git clone https://github.com/una-dinosauria/3d-pose-baseline.git
cd 3d-pose-baseline
mkdir -p data/h36m/

Get the data

Go to http://vision.imar.ro/human3.6m/, log in, and download the D3 Positions files for subjects [1, 5, 6, 7, 8, 9, 11], and put them under the folder data/h36m. Your directory structure should look like this

src/
README.md
LICENCE
...
data/
  └── h36m/
    ├── Poses_D3_Positions_S1.tgz
    ├── Poses_D3_Positions_S11.tgz
    ├── Poses_D3_Positions_S5.tgz
    ├── Poses_D3_Positions_S6.tgz
    ├── Poses_D3_Positions_S7.tgz
    ├── Poses_D3_Positions_S8.tgz
    └── Poses_D3_Positions_S9.tgz

Now, move to the data folder, and uncompress all the data

cd data/h36m/
for file in *.tgz; do tar -xvzf $file; done

Finally, download the code-v1.2.zip file, unzip it, and copy the metadata.xml file under data/h36m/

Now, your data directory should look like this:

data/
  └── h36m/
    ├── metadata.xml
    ├── S1/
    ├── S11/
    ├── S5/
    ├── S6/
    ├── S7/
    ├── S8/
    └── S9/

There is one little fix we need to run for the data to have consistent names:

mv h36m/S1/MyPoseFeatures/D3_Positions/TakingPhoto.cdf \
   h36m/S1/MyPoseFeatures/D3_Positions/Photo.cdf

mv h36m/S1/MyPoseFeatures/D3_Positions/TakingPhoto\ 1.cdf \
   h36m/S1/MyPoseFeatures/D3_Positions/Photo\ 1.cdf

mv h36m/S1/MyPoseFeatures/D3_Positions/WalkingDog.cdf \
   h36m/S1/MyPoseFeatures/D3_Positions/WalkDog.cdf

mv h36m/S1/MyPoseFeatures/D3_Positions/WalkingDog\ 1.cdf \
   h36m/S1/MyPoseFeatures/D3_Positions/WalkDog\ 1.cdf

And you are done!

Please note that we are currently not supporting SH detections anymore, only training from GT 2d detections is possible now.

Quick demo

For a quick demo, you can train for one epoch and visualize the results. To train, run

python src/predict_3dpose.py --camera_frame --residual --batch_norm --dropout 0.5 --max_norm --evaluateActionWise --epochs 1

This should take about <5 minutes to complete on a GTX 1080, and give you around 56 mm of error on the test set.

Now, to visualize the results, simply run

python src/predict_3dpose.py --camera_frame --residual --batch_norm --dropout 0.5 --max_norm --evaluateActionWise --epochs 1 --sample --load 24371

This will produce a visualization similar to this:

Training

To train a model with clean 2d detections, run:

python src/predict_3dpose.py --camera_frame --residual --batch_norm --dropout 0.5 --max_norm --evaluateActionWise

This corresponds to Table 2, bottom row. Ours (GT detections) (MA)

Citing

If you use our code, please cite our work

@inproceedings{martinez_2017_3dbaseline,
  title={A simple yet effective baseline for 3d human pose estimation},
  author={Martinez, Julieta and Hossain, Rayat and Romero, Javier and Little, James J.},
  booktitle={ICCV},
  year={2017}
}

Other implementations

Pytorch by @weigq
MXNet/Gluon by @lck1201

Extensions

@ArashHosseini maintains a fork for estimating 3d human poses using the 2d poses estimated by either OpenPose or tf-pose-estimation as input.

License

MIT

A simple baseline for 3d human pose estimation in tensorflow. Presented at ICCV 17.

Related tags

Overview

3d-pose-baseline

Dependencies

First of all

Quick demo

Training

Citing

Other implementations

Extensions

License

Owner

Julieta Martinez

Reference models and tools for Cloud TPUs.

MEND: Model Editing Networks using Gradient Decomposition

A-ESRGAN aims to provide better super-resolution images by using multi-scale attention U-net discriminators.

Code for "LoFTR: Detector-Free Local Feature Matching with Transformers", CVPR 2021

Official Implementation of DE-CondDETR and DELA-CondDETR in "Towards Data-Efficient Detection Transformers"

Public scripts, services, and configuration for running a smart home K3S network cluster

docTR by Mindee (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity

Official code for UnICORNN (ICML 2021)

PyTorch and Tensorflow functional model definitions

Code release for NeX: Real-time View Synthesis with Neural Basis Expansion

Free course that takes you from zero to Reinforcement Learning PRO 🦸🏻‍🦸🏽

PFENet: Prior Guided Feature Enrichment Network for Few-shot Segmentation (TPAMI).

Code of the paper "Shaping Visual Representations with Attributes for Few-Shot Learning (ASL)".

How the Deep Q-learning method works and discuss the new ideas that makes the algorithm work

EMNLP 2021 - Frustratingly Simple Pretraining Alternatives to Masked Language Modeling

Code for CVPR2021 "Visualizing Adapted Knowledge in Domain Transfer". Visualization for domain adaptation. #explainable-ai

Capture all information throughout your model's development in a reproducible way and tie results directly to the model code!

Self-Supervised Pre-Training for Transformer-Based Person Re-Identification

Official pytorch code for "APP: Anytime Progressive Pruning"