Basics of 2D and 3D Human Pose Estimation.

Overview

Human Pose Estimation 101

If you want a slightly more rigorous tutorial that covers the basics of Human Pose Estimation and how the field has evolved, check out the articles I published on 2D Pose Estimation and 3D Pose Estimation.

Basics

  • Defined as the problem of localizing human joints (also called keypoints)
  • The human body is modeled as an articulated body: rigid parts connected by joints. A body with strong articulation is one capable of strong contortion.
  • Pose Estimation is the search for a specific pose in the space of all articulated poses
  • Number of keypoints varies with the dataset - LSP has 14, MPII has 16, and 16 are used in Human3.6M
  • Classified into 2D and 3D Pose Estimation
    • 2D Pose Estimation
      • Estimate a 2D pose, i.e. (x, y) coordinates for each joint in pixel space, from an RGB image
    • 3D Pose Estimation
      • Estimate a 3D pose, i.e. (x, y, z) coordinates for each joint in metric space, from an RGB image or, in earlier work, from RGB-D sensor data. (Research in the past few years has focused heavily on generating 3D poses from 2D images and videos.)

Loss

  • Most commonly used loss function - Mean Squared Error (MSE), also called least squares loss
  • This is a regression problem. The model is trained with an MSE loss to output continuous coordinates, so its predictions move toward the ground-truth coordinates in small increments as training progresses.
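
As a rough illustration of the loss described above, here is a minimal NumPy sketch of MSE over joint coordinates. The function name `mse_loss` and the toy coordinates are assumptions made for this example; a real training setup would use the framework's own differentiable loss.

```python
import numpy as np

def mse_loss(pred, target):
    """Mean squared error over all joints and coordinates.

    pred, target: arrays of shape (num_joints, dims), dims = 2 or 3.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))

# Toy example (made-up numbers): two joints in pixel space.
target = np.array([[10.0, 20.0], [30.0, 40.0]])
pred = np.array([[12.0, 20.0], [30.0, 44.0]])
print(mse_loss(pred, target))  # squared errors 4, 0, 0, 16 -> 5.0
```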

Evaluation metrics

Percentage of Correct Parts - PCP

  • A limb is considered detected (a correct part) if the distance between each of its two predicted joint locations and the corresponding true joint location is at most half of the limb length (PCP at 0.5)
  • Measures detection rate of limbs
  • Cons - penalizes shorter limbs, since their threshold is smaller
  • Calculation
    • For a specific part, PCP = (No. of correct parts for entire dataset) / (No. of total parts for entire dataset)
    • Take a dataset with 10 images and 1 pose per image. Each pose has 8 parts - (upper arm, lower arm, upper leg, lower leg) x 2
    • No. of upper arms = 10 * 2 = 20
    • No. of lower arms = 20
    • No. of lower legs = No. of upper legs = 20
    • If the upper arm is detected correctly for 17 of the 20 upper arms (10 right arms and 7 left), PCP = 17/20 = 85%
  • Higher the better
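
The calculation above can be sketched in a few lines of NumPy. This assumes one common reading of PCP, where both endpoints of a limb must lie within `alpha` times the ground-truth limb length; the function name `pcp`, the array layout, and the toy data are assumptions for illustration.

```python
import numpy as np

def pcp(pred_limbs, gt_limbs, alpha=0.5):
    """Fraction of limbs whose two endpoints are both within
    alpha * (ground-truth limb length) of the true endpoints.

    pred_limbs, gt_limbs: shape (num_limbs, 2, dims), two endpoints per limb.
    """
    pred = np.asarray(pred_limbs, dtype=float)
    gt = np.asarray(gt_limbs, dtype=float)
    limb_len = np.linalg.norm(gt[:, 0] - gt[:, 1], axis=-1)  # (num_limbs,)
    end_err = np.linalg.norm(pred - gt, axis=-1)             # (num_limbs, 2)
    correct = np.all(end_err <= alpha * limb_len[:, None], axis=1)
    return float(correct.mean())

# 20 upper arms, each 10 px long; 17 are predicted well, 3 are far off.
gt = np.tile([[0, 0], [0, 10]], (20, 1, 1))  # shape (20, 2, 2)
pred = gt + 1.0                              # small error on every endpoint
pred[17:, 0, 0] += 8.0                       # push 3 limbs past the threshold
print(pcp(pred, gt))  # 17 / 20 = 0.85
```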

Percentage of Correct Key-points - PCK

  • Detected joint is considered correct if the distance between the predicted and the true joint is within a certain threshold (threshold varies)
  • PCKh@0.5 is when the threshold = 50% of the head bone link
  • PCK@0.2 == Distance between predicted and true joint < 0.2 * torso diameter
  • For 3D poses, a fixed threshold of 150 mm is sometimes used
  • Typical keypoints: head, shoulder, elbow, wrist, hip, knee, ankle
  • PCK is used for 2D and 3D (PCK3D)
  • Higher the better
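
A minimal sketch of PCK follows, assuming the reference length (head bone link for PCKh, torso diameter for PCK) is supplied per pose by the caller. The function name `pck` and its argument layout are assumptions for this example.

```python
import numpy as np

def pck(pred, gt, ref_length, alpha):
    """Fraction of joints with error below alpha * ref_length.

    pred, gt:   shape (num_poses, num_joints, dims)
    ref_length: shape (num_poses,), e.g. head bone length per pose
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    dist = np.linalg.norm(pred - gt, axis=-1)  # (num_poses, num_joints)
    thresh = alpha * np.asarray(ref_length, dtype=float)[:, None]
    return float(np.mean(dist < thresh))

# One pose, four joints, head bone length 10 px -> threshold 5 px at alpha=0.5.
gt = np.zeros((1, 4, 2))
pred = np.array([[[0, 0], [3, 0], [0, 6], [4, 0]]], dtype=float)
print(pck(pred, gt, ref_length=[10.0], alpha=0.5))  # errors 0, 3, 6, 4 -> 0.75
```

For a fixed metric threshold such as 150 mm, the same function can be called with `ref_length` set to 150 for every pose and `alpha=1.0`.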

Percentage of Detected Joints - PDJ

  • Detected joint is considered correct if the distance between the predicted and the true joint is within a certain fraction of the torso diameter
  • Alleviates the shorter-limb problem of PCP, since the threshold depends on torso size rather than on the length of each limb
  • PDJ at 0.2 → Distance between predicted and true joint < 0.2 * torso diameter
  • Typically used for 2D Pose Estimation
  • Higher the better
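
PDJ can be sketched the same way, with the torso diameter measured on the ground-truth pose. This example assumes the torso diameter is the distance between a shoulder and the opposite hip, which is one convention and not something this document specifies; `pdj`, `shoulder_idx`, and `hip_idx` are hypothetical names.

```python
import numpy as np

def pdj(pred, gt, shoulder_idx, hip_idx, fraction=0.2):
    """Fraction of joints with error below fraction * torso diameter.

    pred, gt: shape (num_poses, num_joints, dims).
    Torso diameter is assumed to be the ground-truth distance between
    joint `shoulder_idx` and joint `hip_idx`.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    torso = np.linalg.norm(gt[:, shoulder_idx] - gt[:, hip_idx], axis=-1)
    dist = np.linalg.norm(pred - gt, axis=-1)  # (num_poses, num_joints)
    return float(np.mean(dist < fraction * torso[:, None]))

# One pose with three joints: shoulder, hip, wrist. Torso diameter = 50 px,
# so the threshold at 0.2 is 10 px.
gt = np.array([[[0, 0], [30, 40], [100, 0]]], dtype=float)
pred = np.array([[[0, 5], [30, 52], [100, 9]]], dtype=float)  # errors 5, 12, 9
score = pdj(pred, gt, shoulder_idx=0, hip_idx=1)              # 2 of 3 joints pass
print(score)
```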

Mean Per Joint Position Error - MPJPE

  • Per joint position error = Euclidean distance between ground truth and prediction for a joint
  • Mean per joint position error = Mean of per joint position error for all k joints (Typically, k = 16)
  • Calculated after aligning the root joints (typically the pelvis) of the estimated and ground-truth 3D poses
  • PA MPJPE
    • Procrustes analysis MPJPE
    • MPJPE calculated after the estimated 3D pose is aligned to the ground truth by the Procrustes method
    • The Procrustes method applies the similarity transformation (scale, rotation, translation) that best fits the estimate to the ground truth
  • Lower the better
  • Used for 3D Pose Estimation
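
The two variants above can be sketched in NumPy as follows. The Procrustes step here uses the standard SVD-based least-squares solution for a similarity transform; the function names, the choice of joint 0 as root, and the toy pose are assumptions for illustration, not a reference implementation of any benchmark's evaluation script.

```python
import numpy as np

def mpjpe(pred, gt, root=0):
    """Mean Euclidean joint error after subtracting the root joint.
    pred, gt: shape (num_joints, 3)."""
    pred = pred - pred[root]
    gt = gt - gt[root]
    return float(np.mean(np.linalg.norm(pred - gt, axis=-1)))

def procrustes_align(pred, gt):
    """Apply the least-squares similarity transform mapping pred onto gt."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # SVD of the cross-covariance gives the optimal rotation.
    U, S, Vt = np.linalg.svd(p.T @ g)
    D = np.eye(pred.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(Vt.T @ U.T))  # avoid reflections
    R = Vt.T @ D @ U.T
    scale = (S * np.diag(D)).sum() / (p ** 2).sum()
    return scale * p @ R.T + mu_g

def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment of pred to gt."""
    aligned = procrustes_align(pred, gt)
    return float(np.mean(np.linalg.norm(aligned - gt, axis=-1)))

# Toy pose (made-up numbers): the prediction is the ground truth
# scaled by 2, rotated 90 degrees about z, and shifted.
gt = np.array([[0, 0, 0], [1, 0, 0], [0, 2, 0], [0, 0, 3]], dtype=float)
Rz = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)
pred = 2.0 * gt @ Rz.T + np.array([1.0, 2.0, 3.0])

print(mpjpe(pred, gt))     # large: root alignment only removes the shift
print(pa_mpjpe(pred, gt))  # close to 0: scale and rotation are removed too
```

The gap between the two numbers is the point of PA MPJPE: it ignores global scale and orientation errors and measures only the error in the articulated pose itself.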

AUC

Important Applications

  • Activity Analysis
  • Human-Computer Interaction (HCI)
  • Virtual Reality
  • Augmented Reality
  • Amazon Go presents an important domain for the application of Human Pose Estimation. Cameras track and recognize people and their actions, for which Pose Estimation is an important component. Services that track and measure human activity rely heavily on Human Pose Estimation.

Informative roadmap on 2D Human Pose Estimation research

Owner
Sudharshan Chandra Babu
Machine Learning Engineer