A simple baseline for 3d human pose estimation in tensorflow. Presented at ICCV 17.

Overview

3d-pose-baseline

This is the code for the paper

Julieta Martinez, Rayat Hossain, Javier Romero, James J. Little. A simple yet effective baseline for 3d human pose estimation. In ICCV, 2017. https://arxiv.org/pdf/1705.03098.pdf.

The code in this repository was mostly written by Julieta Martinez, Rayat Hossain and Javier Romero.

We provide a strong baseline for 3d human pose estimation that also sheds light on the challenges of current approaches. Our model is lightweight and we strive to make our code transparent, compact, and easy-to-understand.

Dependencies

First of all

  1. Watch our video: https://youtu.be/Hmi3Pd9x1BE

  2. Clone this repository

git clone https://github.com/una-dinosauria/3d-pose-baseline.git
cd 3d-pose-baseline
mkdir -p data/h36m/
  1. Get the data

Go to http://vision.imar.ro/human3.6m/, log in, and download the D3 Positions files for subjects [1, 5, 6, 7, 8, 9, 11], and put them under the folder data/h36m. Your directory structure should look like this

src/
README.md
LICENCE
...
data/
  └── h36m/
    ├── Poses_D3_Positions_S1.tgz
    ├── Poses_D3_Positions_S11.tgz
    ├── Poses_D3_Positions_S5.tgz
    ├── Poses_D3_Positions_S6.tgz
    ├── Poses_D3_Positions_S7.tgz
    ├── Poses_D3_Positions_S8.tgz
    └── Poses_D3_Positions_S9.tgz

Now, move to the data folder, and uncompress all the data

cd data/h36m/
for file in *.tgz; do tar -xvzf $file; done

Finally, download the code-v1.2.zip file, unzip it, and copy the metadata.xml file under data/h36m/

Now, your data directory should look like this:

data/
  └── h36m/
    ├── metadata.xml
    ├── S1/
    ├── S11/
    ├── S5/
    ├── S6/
    ├── S7/
    ├── S8/
    └── S9/

There is one little fix we need to run for the data to have consistent names:

mv h36m/S1/MyPoseFeatures/D3_Positions/TakingPhoto.cdf \
   h36m/S1/MyPoseFeatures/D3_Positions/Photo.cdf

mv h36m/S1/MyPoseFeatures/D3_Positions/TakingPhoto\ 1.cdf \
   h36m/S1/MyPoseFeatures/D3_Positions/Photo\ 1.cdf

mv h36m/S1/MyPoseFeatures/D3_Positions/WalkingDog.cdf \
   h36m/S1/MyPoseFeatures/D3_Positions/WalkDog.cdf

mv h36m/S1/MyPoseFeatures/D3_Positions/WalkingDog\ 1.cdf \
   h36m/S1/MyPoseFeatures/D3_Positions/WalkDog\ 1.cdf

And you are done!

Please note that we are currently not supporting SH detections anymore, only training from GT 2d detections is possible now.

Quick demo

For a quick demo, you can train for one epoch and visualize the results. To train, run

python src/predict_3dpose.py --camera_frame --residual --batch_norm --dropout 0.5 --max_norm --evaluateActionWise --epochs 1

This should take about <5 minutes to complete on a GTX 1080, and give you around 56 mm of error on the test set.

Now, to visualize the results, simply run

python src/predict_3dpose.py --camera_frame --residual --batch_norm --dropout 0.5 --max_norm --evaluateActionWise --epochs 1 --sample --load 24371

This will produce a visualization similar to this:

Visualization example

Training

To train a model with clean 2d detections, run:

python src/predict_3dpose.py --camera_frame --residual --batch_norm --dropout 0.5 --max_norm --evaluateActionWise

This corresponds to Table 2, bottom row. Ours (GT detections) (MA)

Citing

If you use our code, please cite our work

@inproceedings{martinez_2017_3dbaseline,
  title={A simple yet effective baseline for 3d human pose estimation},
  author={Martinez, Julieta and Hossain, Rayat and Romero, Javier and Little, James J.},
  booktitle={ICCV},
  year={2017}
}

Other implementations

Extensions

License

MIT

Owner
Julieta Martinez
Not affiliated with the University of Toronto
Julieta Martinez
Spatial Temporal Graph Convolutional Networks (ST-GCN) for Skeleton-Based Action Recognition in PyTorch

Reminder ST-GCN has transferred to MMSkeleton, and keep on developing as an flexible open source toolbox for skeleton-based human understanding. You a

sijie yan 1.1k Dec 25, 2022
DrNAS: Dirichlet Neural Architecture Search

This paper proposes a novel differentiable architecture search method by formulating it into a distribution learning problem. We treat the continuously relaxed architecture mixing weight as random va

Xiangning Chen 37 Jan 03, 2023
Code for "Human Pose Regression with Residual Log-likelihood Estimation", ICCV 2021 Oral

Human Pose Regression with Residual Log-likelihood Estimation [Paper] [arXiv] [Project Page] Human Pose Regression with Residual Log-likelihood Estima

JeffLi 347 Dec 24, 2022
Learning Efficient Online 3D Bin Packing on Packing Configuration Trees

Learning Efficient Online 3D Bin Packing on Packing Configuration Trees This repository is being continuously updated, please stay tuned! Any code con

86 Dec 28, 2022
Pytorch Implementation of rpautrat/SuperPoint

SuperPoint-Pytorch (A Pure Pytorch Implementation) SuperPoint: Self-Supervised Interest Point Detection and Description Thanks This work is based on:

76 Dec 27, 2022
Structural Constraints on Information Content in Human Brain States

Structural Constraints on Information Content in Human Brain States Code accompanying the paper "The information content of brain states is explained

Leon Weninger 3 Sep 07, 2022
Hand Gesture Volume Control | Open CV | Computer Vision

Gesture Volume Control Hand Gesture Volume Control | Open CV | Computer Vision Use gesture control to change the volume of a computer. First we look i

Jhenil Parihar 3 Jun 15, 2022
Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding

2D-TAN (Optimized) Introduction This is an optimized re-implementation repository for AAAI'2020 paper: Learning 2D Temporal Localization Networks for

Joya Chen 112 Dec 31, 2022
Code for You Only Cut Once: Boosting Data Augmentation with a Single Cut

You Only Cut Once (YOCO) YOCO is a simple method/strategy of performing augmenta

88 Dec 28, 2022
git git《Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking》(CVPR 2021) GitHub:git2] 《Masksembles for Uncertainty Estimation》(CVPR 2021) GitHub:git3]

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking Ning Wang, Wengang Zhou, Jie Wang, and Houqiang Li Accepted by CVPR

NingWang 236 Dec 22, 2022
Combining Automatic Labelers and Expert Annotations for Accurate Radiology Report Labeling Using BERT

CheXbert: Combining Automatic Labelers and Expert Annotations for Accurate Radiology Report Labeling Using BERT CheXbert is an accurate, automated dee

Stanford Machine Learning Group 51 Dec 08, 2022
This repository attempts to replicate the SqueezeNet architecture and implement the same on an image classification task.

SqueezeNet-Implementation This repository attempts to replicate the SqueezeNet architecture using TensorFlow discussed in the research paper: "Squeeze

Rohan Mathur 3 Dec 13, 2022
Efficient 3D Backbone Network for Temporal Modeling

VoV3D is an efficient and effective 3D backbone network for temporal modeling implemented on top of PySlowFast. Diverse Temporal Aggregation and

102 Dec 06, 2022
Transfer Learning Shootout for PyTorch's model zoo (torchvision)

pytorch-retraining Transfer Learning shootout for PyTorch's model zoo (torchvision). Load any pretrained model with custom final layer (num_classes) f

Alexander Hirner 169 Jun 29, 2022
RRL: Resnet as representation for Reinforcement Learning

Resnet as representation for Reinforcement Learning (RRL) is a simple yet effective approach for training behaviors directly from visual inputs. We demonstrate that features learned by standard image

Meta Research 21 Dec 07, 2022
Code for KDD'20 "An Efficient Neighborhood-based Interaction Model for Recommendation on Heterogeneous Graph"

Heterogeneous INteract and aggreGatE (GraphHINGE) This is a pytorch implementation of GraphHINGE model. This is the experiment code in the following w

Jinjiarui 69 Nov 24, 2022
On Evaluation Metrics for Graph Generative Models

On Evaluation Metrics for Graph Generative Models Authors: Rylee Thompson, Boris Knyazev, Elahe Ghalebi, Jungtaek Kim, Graham Taylor This is the offic

13 Jan 07, 2023
Serve TensorFlow ML models with TF-Serving and then create a Streamlit UI to use them

TensorFlow Serving + Streamlit! ✨ 🖼️ Serve TensorFlow ML models with TF-Serving and then create a Streamlit UI to use them! This is a pretty simple S

Álvaro Bartolomé 18 Jan 07, 2023
Baseline of DCASE 2020 task 4

Couple Learning for SED This repository provides the data and source code for sound event detection (SED) task. The improvement of the Couple Learning

21 Oct 18, 2022
BERTMap: A BERT-Based Ontology Alignment System

BERTMap: A BERT-based Ontology Alignment System Important Notices The relevant paper was accepted in AAAI-2022. Arxiv version is available at: https:/

KRR 36 Dec 24, 2022