This is an official implementation of our CVPR 2021 paper "Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression" (https://arxiv.org/abs/2104.02300).

Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression

Introduction

In this paper, we are interested in the bottom-up paradigm of estimating human poses from an image. We study the dense keypoint regression framework, which has previously been inferior to the keypoint detection and grouping framework. Our motivation is that regressing keypoint positions accurately requires learning representations that focus on the keypoint regions.

We present a simple yet effective approach, named disentangled keypoint regression (DEKR). We adopt adaptive convolutions through a pixel-wise spatial transformer to activate the pixels in the keypoint regions and accordingly learn representations from them. We use a multi-branch structure for separate regression: each branch learns a representation with dedicated adaptive convolutions and regresses one keypoint. The resulting disentangled representations are able to attend to the keypoint regions, respectively, and thus the keypoint regression is spatially more accurate. We empirically show that the proposed direct regression method outperforms keypoint detection and grouping methods and achieves superior bottom-up pose estimation results on two benchmark datasets, COCO and CrowdPose.
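
To make the idea of dense keypoint regression concrete, here is a minimal sketch. It assumes that each candidate center pixel carries one regressed 2D offset per keypoint, and that a pose is read out by adding those offsets to the center position. The function name `decode_pose` and the toy numbers are illustrative only; the tensor layout used in this repository may differ.

```python
def decode_pose(center, offsets):
    """Turn one center pixel and its regressed offsets into keypoint coordinates.

    center  -- (x, y) position of a candidate person center
    offsets -- list of (dx, dy) pairs, one per keypoint, regressed at that pixel
    """
    cx, cy = center
    return [(cx + dx, cy + dy) for dx, dy in offsets]


# Toy example: three keypoints regressed at the center pixel (10, 20).
pose = decode_pose((10, 20), [(-2.0, -5.0), (0.0, 3.5), (4.0, 1.0)])
print(pose)
```

In DEKR, each offset pair would come from its own branch, so that the representation used for one keypoint is not shared with the others.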

Main Results

Results on COCO val2017 without multi-scale test

Backbone        Input size  #Params  GFLOPs  AP     AP.5   AP.75  AP(M)  AP(L)  AR     AR.5   AR.75  AR(M)  AR(L)
pose_hrnet_w32  512x512     29.6M    45.4    0.680  0.867  0.745  0.621  0.777  0.730  0.898  0.784  0.662  0.827
pose_hrnet_w48  640x640     65.7M    141.5   0.710  0.883  0.774  0.667  0.785  0.760  0.914  0.815  0.706  0.840

Results on COCO val2017 with multi-scale test

Backbone        Input size  #Params  GFLOPs  AP     AP.5   AP.75  AP(M)  AP(L)  AR     AR.5   AR.75  AR(M)  AR(L)
pose_hrnet_w32  512x512     29.6M    45.4    0.707  0.877  0.771  0.662  0.778  0.759  0.913  0.813  0.705  0.836
pose_hrnet_w48  640x640     65.7M    141.5   0.723  0.883  0.786  0.686  0.786  0.777  0.924  0.832  0.728  0.849

Results on COCO test-dev2017 without multi-scale test

Backbone        Input size  #Params  GFLOPs  AP     AP.5   AP.75  AP(M)  AP(L)  AR     AR.5   AR.75  AR(M)  AR(L)
pose_hrnet_w32  512x512     29.6M    45.4    0.673  0.879  0.741  0.615  0.761  0.724  0.908  0.782  0.654  0.819
pose_hrnet_w48  640x640     65.7M    141.5   0.700  0.894  0.773  0.657  0.769  0.754  0.927  0.816  0.697  0.832

Results on COCO test-dev2017 with multi-scale test

Backbone        Input size  #Params  GFLOPs  AP     AP.5   AP.75  AP(M)  AP(L)  AR     AR.5   AR.75  AR(M)  AR(L)
pose_hrnet_w32  512x512     29.6M    45.4    0.698  0.890  0.766  0.652  0.765  0.751  0.924  0.811  0.695  0.828
pose_hrnet_w48  640x640     65.7M    141.5   0.710  0.892  0.780  0.671  0.769  0.767  0.932  0.830  0.715  0.839

Results on CrowdPose test without multi-scale test

Method          AP     AP.5   AP.75  AP(E)  AP(M)  AP(H)
pose_hrnet_w32  0.657  0.857  0.704  0.730  0.664  0.575
pose_hrnet_w48  0.673  0.864  0.722  0.746  0.681  0.587

Results on CrowdPose test with multi-scale test

Method          AP     AP.5   AP.75  AP(E)  AP(M)  AP(H)
pose_hrnet_w32  0.670  0.854  0.724  0.755  0.680  0.569
pose_hrnet_w48  0.680  0.855  0.734  0.766  0.688  0.584

Results with matching regression results to the closest keypoints detected from the keypoint heatmaps

Dataset            DEKR-w32-SS  DEKR-w32-MS  DEKR-w48-SS  DEKR-w48-MS
coco_val2017       0.680        0.710        0.710        0.728
coco_test-dev2017  0.673        0.702        0.701        0.714
crowdpose_test     0.655        0.675        0.670        0.683

Note:

  • Flip test is used.
  • GFLOPs is for convolution and linear layers only.
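
For readers unfamiliar with flip testing, the sketch below shows the general idea on heatmap-like outputs: the model is also run on a horizontally mirrored image, its output is mirrored back, left/right keypoint channels are swapped, and the two predictions are averaged. The helper names and the nested-list representation are assumptions made for illustration, and this is not the repository's implementation. Offset maps would additionally need their x component negated, which is omitted here.

```python
def flip_back(flipped_maps, flip_pairs):
    """Undo a horizontal flip on per-keypoint maps and swap left/right channels."""
    # Mirror every row of every channel.
    maps = [[row[::-1] for row in channel] for channel in flipped_maps]
    # Swap the channels of left/right keypoint pairs (e.g. left and right eye).
    for a, b in flip_pairs:
        maps[a], maps[b] = maps[b], maps[a]
    return maps


def average_maps(maps_a, maps_b):
    """Element-wise mean of two sets of maps with identical shapes."""
    return [
        [[(x + y) / 2 for x, y in zip(row_a, row_b)]
         for row_a, row_b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(maps_a, maps_b)
    ]


# Toy example: two channels (a left/right pair), each a 1x2 map.
original = [[[1.0, 0.0]], [[0.0, 0.0]]]
from_flipped_image = [[[0.0, 0.0]], [[0.0, 1.0]]]
fused = average_maps(original, flip_back(from_flipped_image, [(0, 1)]))
print(fused)
```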

Environment

The code is developed using Python 3.6 on Ubuntu 16.04. NVIDIA GPUs are needed. The code is developed and tested using 4 NVIDIA V100 GPU cards for HRNet-w32 and 8 NVIDIA V100 GPU cards for HRNet-w48. Other platforms are not fully tested.

Quick start

Installation

  1. Clone this repo. We refer to the cloned directory as ${POSE_ROOT}.

  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Install COCOAPI:

    # COCOAPI=/path/to/clone/cocoapi
    git clone https://github.com/cocodataset/cocoapi.git $COCOAPI
    cd $COCOAPI/PythonAPI
    # Install into global site-packages
    make install
    # Alternatively, if you do not have permissions or prefer
    # not to install the COCO API into global site-packages
    python3 setup.py install --user
    

    Note that instructions like # COCOAPI=/path/to/clone/cocoapi indicate that you should pick a path where you'd like to have the software cloned and then set an environment variable (COCOAPI in this case) accordingly.

  4. Install CrowdPoseAPI in exactly the same way as COCOAPI.

  5. Create the output directory (for trained models) and the log directory (for TensorBoard logs):

    mkdir output 
    mkdir log
    

    Your directory tree should look like this:

    ${POSE_ROOT}
    ├── data
    ├── model
    ├── experiments
    ├── lib
    ├── tools 
    ├── log
    ├── output
    ├── README.md
    ├── requirements.txt
    └── setup.py
    
  6. Download the pretrained models and our well-trained models from the model zoo (OneDrive) and make the model directory look like this:

    ${POSE_ROOT}
    |-- model
    `-- |-- imagenet
        |   |-- hrnet_w32-36af842e.pth
        |   `-- hrnetv2_w48_imagenet_pretrained.pth
        |-- pose_coco
        |   |-- pose_dekr_hrnetw32_coco.pth
        |   `-- pose_dekr_hrnetw48_coco.pth
        |-- pose_crowdpose
        |   |-- pose_dekr_hrnetw32_crowdpose.pth
        |   `-- pose_dekr_hrnetw48_crowdpose.pth
        `-- rescore
            |-- final_rescore_coco_kpt.pth
            `-- final_rescore_crowd_pose_kpt.pth
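
Once the steps above are done, a short script can confirm that the layout matches. The following is a sketch under stated assumptions: `missing_files` and `missing_modules` are hypothetical helpers, not part of this repo, and the module names `pycocotools` and `crowdposetools` are assumed to be what COCOAPI and CrowdPoseAPI install.

```python
import importlib.util
from pathlib import Path

# Checkpoint paths copied from the directory tree above, relative to ${POSE_ROOT}.
EXPECTED_FILES = [
    "model/imagenet/hrnet_w32-36af842e.pth",
    "model/imagenet/hrnetv2_w48_imagenet_pretrained.pth",
    "model/pose_coco/pose_dekr_hrnetw32_coco.pth",
    "model/pose_coco/pose_dekr_hrnetw48_coco.pth",
    "model/pose_crowdpose/pose_dekr_hrnetw32_crowdpose.pth",
    "model/pose_crowdpose/pose_dekr_hrnetw48_crowdpose.pth",
    "model/rescore/final_rescore_coco_kpt.pth",
    "model/rescore/final_rescore_crowd_pose_kpt.pth",
]


def missing_files(pose_root, expected=EXPECTED_FILES):
    """Return the expected files that do not exist under pose_root."""
    root = Path(pose_root)
    return [rel for rel in expected if not (root / rel).is_file()]


def missing_modules(names=("pycocotools", "crowdposetools")):
    """Return the module names that cannot be found (module names are assumed)."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Run from ${POSE_ROOT}; empty lists mean everything was found.
    print("missing files:", missing_files("."))
    print("missing modules:", missing_modules())
```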
    

Data preparation

For COCO data, please download from COCO download. The 2017 Train/Val sets are needed for COCO keypoints training and validation. Download and extract them under ${POSE_ROOT}/data, and make them look like this:

${POSE_ROOT}
|-- data
`-- |-- coco
    `-- |-- annotations
        |   |-- person_keypoints_train2017.json
        |   `-- person_keypoints_val2017.json
        `-- images
            |-- train2017.zip
            `-- val2017.zip

For CrowdPose data, please download from CrowdPose download. The Train/Val sets are needed for CrowdPose keypoints training. Download and extract them under ${POSE_ROOT}/data, and make them look like this:

${POSE_ROOT}
|-- data
`-- |-- crowdpose
    `-- |-- json
        |   |-- crowdpose_train.json
        |   |-- crowdpose_val.json
        |   |-- crowdpose_trainval.json (generated by tools/crowdpose_concat_train_val.py)
        |   `-- crowdpose_test.json
        `-- images.zip

After downloading the data, run python tools/crowdpose_concat_train_val.py under ${POSE_ROOT} to create the trainval set.
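
As a rough illustration of what such a merge might look like, the sketch below concatenates two COCO-style annotation dictionaries. It assumes that both files hold `images` and `annotations` lists and that their ids do not collide. `concat_train_val` is a hypothetical helper written for this example, not the code of tools/crowdpose_concat_train_val.py.

```python
def concat_train_val(train, val):
    """Merge two COCO-style annotation dicts into one trainval dict."""
    # Keep shared metadata (e.g. categories) from the train split.
    merged = {key: value for key, value in train.items()
              if key not in ("images", "annotations")}
    merged["images"] = list(train["images"]) + list(val["images"])
    merged["annotations"] = list(train["annotations"]) + list(val["annotations"])
    return merged


# Toy dictionaries standing in for the parsed train and val JSON files.
train = {"categories": [{"id": 1, "name": "person"}],
         "images": [{"id": 1}],
         "annotations": [{"id": 10, "image_id": 1}]}
val = {"categories": [{"id": 1, "name": "person"}],
       "images": [{"id": 2}],
       "annotations": [{"id": 20, "image_id": 2}]}

trainval = concat_train_val(train, val)
print(len(trainval["images"]), len(trainval["annotations"]))  # 2 2
```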

Training and Testing

Testing on COCO val2017 dataset without multi-scale test using well-trained pose model

python tools/valid.py \
    --cfg experiments/coco/w32/w32_4x_reg03_bs10_512_adam_lr1e-3_coco_x140.yaml \
    TEST.MODEL_FILE model/pose_coco/pose_dekr_hrnetw32_coco.pth
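
The trailing arguments after --cfg (such as TEST.MODEL_FILE followed by a path) override entries of the YAML config. The sketch below illustrates that mechanism under the assumption that they are consumed as alternating KEY VALUE pairs with dot-separated keys, as yacs-style configs do. `apply_overrides` and the toy config values are hypothetical, and values stay strings here, whereas a real config library would typically convert them to the existing type.

```python
def apply_overrides(config, opts):
    """Apply ['A.B', 'value', ...] style overrides to a nested dict in place."""
    if len(opts) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    for key, value in zip(opts[0::2], opts[1::2]):
        *parents, leaf = key.split(".")
        node = config
        for name in parents:
            node = node[name]  # raises KeyError for an unknown section
        if leaf not in node:
            raise KeyError(key)  # only existing options may be overridden
        node[leaf] = value
    return config


# Toy config standing in for the parsed YAML file (values are placeholders).
cfg = {"TEST": {"MODEL_FILE": "", "MATCH_HMP": "False"},
       "DATASET": {"TEST": "val2017"}}
apply_overrides(cfg, ["TEST.MODEL_FILE", "path/to/checkpoint.pth",
                      "DATASET.TEST", "test-dev2017"])
print(cfg["TEST"]["MODEL_FILE"], cfg["DATASET"]["TEST"])
```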

Testing on COCO test-dev2017 dataset without multi-scale test using well-trained pose model

python tools/valid.py \
    --cfg experiments/coco/w32/w32_4x_reg03_bs10_512_adam_lr1e-3_coco_x140.yaml \
    TEST.MODEL_FILE model/pose_coco/pose_dekr_hrnetw32_coco.pth \
    DATASET.TEST test-dev2017

Testing on COCO val2017 dataset with multi-scale test using well-trained pose model

python tools/valid.py \
    --cfg experiments/coco/w32/w32_4x_reg03_bs10_512_adam_lr1e-3_coco_x140.yaml \
    TEST.MODEL_FILE model/pose_coco/pose_dekr_hrnetw32_coco.pth \
    TEST.NMS_THRE 0.15 \
    TEST.SCALE_FACTOR 0.5,1,2

Testing on COCO val2017 dataset with matching regression results to the closest keypoints detected from the keypoint heatmaps

python tools/valid.py \
    --cfg experiments/coco/w32/w32_4x_reg03_bs10_512_adam_lr1e-3_coco_x140.yaml \
    TEST.MODEL_FILE model/pose_coco/pose_dekr_hrnetw32_coco.pth \
    TEST.MATCH_HMP True

Testing on CrowdPose test dataset without multi-scale test using well-trained pose model

python tools/valid.py \
    --cfg experiments/crowdpose/w32/w32_4x_reg03_bs10_512_adam_lr1e-3_crowdpose_x300.yaml \
    TEST.MODEL_FILE model/pose_crowdpose/pose_dekr_hrnetw32_crowdpose.pth

Testing on CrowdPose test dataset with multi-scale test using well-trained pose model

python tools/valid.py \
    --cfg experiments/crowdpose/w32/w32_4x_reg03_bs10_512_adam_lr1e-3_crowdpose_x300.yaml \
    TEST.MODEL_FILE model/pose_crowdpose/pose_dekr_hrnetw32_crowdpose.pth \
    TEST.NMS_THRE 0.15 \
    TEST.SCALE_FACTOR 0.5,1,2

Testing on CrowdPose test dataset with matching regression results to the closest keypoints detected from the keypoint heatmaps

python tools/valid.py \
    --cfg experiments/crowdpose/w32/w32_4x_reg03_bs10_512_adam_lr1e-3_crowdpose_x300.yaml \
    TEST.MODEL_FILE model/pose_crowdpose/pose_dekr_hrnetw32_crowdpose.pth \
    TEST.MATCH_HMP True

Training on COCO train2017 dataset

python tools/train.py \
    --cfg experiments/coco/w32/w32_4x_reg03_bs10_512_adam_lr1e-3_coco_x140.yaml

Training on CrowdPose trainval dataset

python tools/train.py \
    --cfg experiments/crowdpose/w32/w32_4x_reg03_bs10_512_adam_lr1e-3_crowdpose_x300.yaml

Using inference demo

python tools/inference_demo.py --cfg experiments/coco/inference_demo_coco.yaml \
    --videoFile ../multi_people.mp4 \
    --outputDir output \
    --visthre 0.3 \
    TEST.MODEL_FILE model/pose_coco/pose_dekr_hrnetw32_coco.pth
python tools/inference_demo.py --cfg experiments/crowdpose/inference_demo_crowdpose.yaml \
    --videoFile ../multi_people.mp4 \
    --outputDir output \
    --visthre 0.3 \
    TEST.MODEL_FILE model/pose_crowdpose/pose_dekr_hrnetw32_crowdpose.pth

Each of the above commands creates a video under the output directory and many pose images under the output/pose directory.

Scoring net

We use a scoring net, consisting of two fully-connected layers (each followed by a ReLU layer) and a linear prediction layer, which aims to learn the OKS score for the corresponding predicted pose. For this scoring net, you can directly use our well-trained model in the model/rescore folder. You can also train your own scoring net using your pose estimation model with the following steps:

  1. Generate the scoring dataset on the train dataset:
python tools/valid.py \
    --cfg experiments/coco/rescore_coco.yaml \
    TEST.MODEL_FILE model/pose_coco/pose_dekr_hrnetw32_coco.pth
python tools/valid.py \
    --cfg experiments/crowdpose/rescore_crowdpose.yaml \
    TEST.MODEL_FILE model/pose_crowdpose/pose_dekr_hrnetw32_crowdpose.pth
  2. Train the scoring net using the scoring dataset:
python tools/train_scorenet.py \
    --cfg experiments/coco/rescore_coco.yaml
python tools/train_scorenet.py \
    --cfg experiments/crowdpose/rescore_crowdpose.yaml
  3. Use the well-trained scoring net to improve the performance of your pose estimation model (above 0.6 AP):
python tools/valid.py \
    --cfg experiments/coco/w32/w32_4x_reg03_bs10_512_adam_lr1e-3_coco_x140.yaml \
    TEST.MODEL_FILE model/pose_coco/pose_dekr_hrnetw32_coco.pth
python tools/valid.py \
    --cfg experiments/crowdpose/w32/w32_4x_reg03_bs10_512_adam_lr1e-3_crowdpose_x300.yaml \
    TEST.MODEL_FILE model/pose_crowdpose/pose_dekr_hrnetw32_crowdpose.pth
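
To illustrate the scoring net described in this section, the sketch below shows a forward pass through two fully-connected layers with ReLU followed by a linear prediction layer, together with an OKS function that could serve as the regression target. It is written in plain Python for readability. The layer sizes, weights, and input features are made up, the OKS function follows the standard COCO definition, and none of this is the code in tools/train_scorenet.py.

```python
import math


def oks(pred, gt, visible, sigmas, area):
    """Object keypoint similarity (COCO definition) between two poses."""
    total, count = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visible, sigmas):
        if v <= 0:
            continue  # unlabeled keypoints do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        total += math.exp(-d2 / (2.0 * area * (2.0 * sigma) ** 2))
        count += 1
    return total / count if count else 0.0


def linear(x, weights, bias):
    """Fully-connected layer: one output per row of weights."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def scoring_net(features, params):
    """Two FC+ReLU layers, then a linear layer that predicts one score."""
    hidden = relu(linear(features, *params[0]))
    hidden = relu(linear(hidden, *params[1]))
    return linear(hidden, *params[2])[0]


# Toy parameters as (weights, bias) per layer; real ones would be learned.
params = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
    ([[0.5]], [0.1]),
]
score = scoring_net([1.0, -2.0], params)

# A prediction identical to the ground truth has an OKS of exactly 1.
target = oks([(5.0, 5.0)], [(5.0, 5.0)], [2], [0.026], area=100.0)
print(score, target)
```

During training, the predicted score would be regressed towards the OKS between each predicted pose and its matched ground-truth pose.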

Acknowledgement

Our code is mainly based on HigherHRNet.

Citation

@inproceedings{GengSXZW21,
  title={Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression},
  author={Zigang Geng and Ke Sun and Bin Xiao and Zhaoxiang Zhang and Jingdong Wang},
  booktitle={CVPR},
  year={2021}
}

@inproceedings{SunXLW19,
  title={Deep High-Resolution Representation Learning for Human Pose Estimation},
  author={Ke Sun and Bin Xiao and Dong Liu and Jingdong Wang},
  booktitle={CVPR},
  year={2019}
}

@article{WangSCJDZLMTWLX19,
  title={Deep High-Resolution Representation Learning for Visual Recognition},
  author={Jingdong Wang and Ke Sun and Tianheng Cheng and
          Borui Jiang and Chaorui Deng and Yang Zhao and Dong Liu and Yadong Mu and
          Mingkui Tan and Xinggang Wang and Wenyu Liu and Bin Xiao},
  journal={TPAMI},
  year={2019}
}