KAPAO is an efficient multi-person human pose estimation model that detects keypoints and poses as objects and fuses the detections to predict human poses.

Overview

KAPAO (Keypoints and Poses as Objects)

KAPAO is an efficient single-stage multi-person human pose estimation model that models keypoints and poses as objects within a dense anchor-based detection framework. When not using test-time augmentation (TTA), KAPAO is much faster and more accurate than previous single-stage methods like DEKR and HigherHRNet:

alt text

This repository contains the official PyTorch implementation for the paper:
Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation.

Our code was forked from ultralytics/yolov5 at commit 5487451.

Setup

  1. If you haven't already, install Anaconda or Miniconda.
  2. Create a new conda environment with Python 3.6: $ conda create -n kapao python=3.6.
  3. Activate the environment: $ conda activate kapao
  4. Clone this repo: $ git clone https://github.com/wmcnally/kapao.git
  5. Install the dependencies: $ cd kapao && pip install -r requirements.txt
  6. Download the trained models: $ sh data/scripts/download_models.sh

Inference Demos

Note: FPS calculations includes all processing, including inference, plotting / tracking, image resizing, etc. See demo script arguments for inference options.

Flash Mob Demo

This demo runs inference on a 720p dance video (native frame-rate of 25 FPS).

alt text

To display the inference results in real-time:
$ python demos/flash_mob.py --weights kapao_s_coco.pt --display --fps

To create the GIF above:
$ python demos/flash_mob.py --weights kapao_s_coco.pt --start 188 --end 196 --gif --fps

Squash Demo

This demo runs inference on a 1080p slow motion squash video (native frame-rate of 25 FPS). It uses a simple player tracking algorithm based on the frame-to-frame pose differences.

alt text

To display the inference results in real-time:
$ python demos/squash.py --weights kapao_s_coco.pt --display --fps

To create the GIF above:
$ python demos/squash.py --weights kapao_s_coco.pt --start 42 --end 50 --gif --fps

COCO Experiments

Download the COCO dataset: $ sh data/scripts/get_coco_kp.sh

Validation (without TTA)

  • KAPAO-S (63.0 AP): $ python val.py --rect
  • KAPAO-M (68.5 AP): $ python val.py --rect --weights kapao_m_coco.pt
  • KAPAO-L (70.6 AP): $ python val.py --rect --weights kapao_l_coco.pt

Validation (with TTA)

  • KAPAO-S (64.3 AP): $ python val.py --scales 0.8 1 1.2 --flips -1 3 -1
  • KAPAO-M (69.6 AP): $ python val.py --weights kapao_m_coco.pt \
    --scales 0.8 1 1.2 --flips -1 3 -1
  • KAPAO-L (71.6 AP): $ python val.py --weights kapao_l_coco.pt \
    --scales 0.8 1 1.2 --flips -1 3 -1

Testing

  • KAPAO-S (63.8 AP): $ python val.py --scales 0.8 1 1.2 --flips -1 3 -1 --task test
  • KAPAO-M (68.8 AP): $ python val.py --weights kapao_m_coco.pt \
    --scales 0.8 1 1.2 --flips -1 3 -1 --task test
  • KAPAO-L (70.3 AP): $ python val.py --weights kapao_l_coco.pt \
    --scales 0.8 1 1.2 --flips -1 3 -1 --task test

Training

The following commands were used to train the KAPAO models on 4 V100s with 32GB memory each.

KAPAO-S:

python -m torch.distributed.launch --nproc_per_node 4 train.py \
--img 1280 \
--batch 128 \
--epochs 500 \
--data data/coco-kp.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5s6.pt \
--project runs/s_e500 \
--name train \
--workers 128

KAPAO-M:

python train.py \
--img 1280 \
--batch 72 \
--epochs 500 \
--data data/coco-kp.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5m6.pt \
--project runs/m_e500 \
--name train \
--workers 128

KAPAO-L:

python train.py \
--img 1280 \
--batch 48 \
--epochs 500 \
--data data/coco-kp.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5l6.pt \
--project runs/l_e500 \
--name train \
--workers 128

Note: DDP is usually recommended but we found training was less stable for KAPAO-M/L using DDP. We are investigating this issue.

CrowdPose Experiments

  • Install the CrowdPose API to your conda environment:
    $ cd .. && git clone https://github.com/Jeff-sjtu/CrowdPose.git
    $ cd CrowdPose/crowdpose-api/PythonAPI && sh install.sh && cd ../../../kapao
  • Download the CrowdPose dataset: $ sh data/scripts/get_crowdpose.sh

Testing

  • KAPAO-S (63.8 AP): $ python val.py --data crowdpose.yaml \
    --weights kapao_s_crowdpose.pt --scales 0.8 1 1.2 --flips -1 3 -1
  • KAPAO-M (67.1 AP): $ python val.py --data crowdpose.yaml \
    --weights kapao_m_crowdpose.pt --scales 0.8 1 1.2 --flips -1 3 -1
  • KAPAO-L (68.9 AP): $ python val.py --data crowdpose.yaml \
    --weights kapao_l_crowdpose.pt --scales 0.8 1 1.2 --flips -1 3 -1

Training

The following commands were used to train the KAPAO models on 4 V100s with 32GB memory each. Training was performed on the trainval split with no validation. The test results above were generated using the last model checkpoint.

KAPAO-S:

python -m torch.distributed.launch --nproc_per_node 4 train.py \
--img 1280 \
--batch 128 \
--epochs 300 \
--data data/crowdpose.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5s6.pt \
--project runs/cp_s_e300 \
--name train \
--workers 128 \
--noval

KAPAO-M:

python train.py \
--img 1280 \
--batch 72 \
--epochs 300 \
--data data/coco-kp.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5m6.pt \
--project runs/cp_m_e300 \
--name train \
--workers 128 \
--noval

KAPAO-L:

python train.py \
--img 1280 \
--batch 48 \
--epochs 300 \
--data data/crowdpose.yaml \
--hyp data/hyps/hyp.kp-p6.yaml \
--val-scales 1 \
--val-flips -1 \
--weights yolov5l6.pt \
--project runs/cp_l_e300 \
--name train \
--workers 128 \
--noval

Acknowledgements

This work was supported in part by Compute Canada, the Canada Research Chairs Program, the Natural Sciences and Engineering Research Council of Canada, a Microsoft Azure Grant, and an NVIDIA Hardware Grant.

If you find this repo is helpful in your research, please cite our paper:

@article{mcnally2021kapao,
  title={Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation},
  author={McNally, William and Vats, Kanav and Wong, Alexander and McPhee, John},
  journal={arXiv preprint arXiv:2111.08557},
  year={2021}
}

Please also consider citing our previous works:

@inproceedings{mcnally2021deepdarts,
  title={DeepDarts: Modeling Keypoints as Objects for Automatic Scorekeeping in Darts using a Single Camera},
  author={McNally, William and Walters, Pascale and Vats, Kanav and Wong, Alexander and McPhee, John},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={4547--4556},
  year={2021}
}

@article{mcnally2021evopose2d,
  title={EvoPose2D: Pushing the Boundaries of 2D Human Pose Estimation Using Accelerated Neuroevolution With Weight Transfer},
  author={McNally, William and Vats, Kanav and Wong, Alexander and McPhee, John},
  journal={IEEE Access},
  volume={9},
  pages={139403--139414},
  year={2021},
  publisher={IEEE}
}
Owner
Will McNally
PhD Candidate
Will McNally
NOMAD - A blackbox optimization software

################################################################################### #

Blackbox Optimization 78 Dec 29, 2022
The code repository for "PyCIL: A Python Toolbox for Class-Incremental Learning" in PyTorch.

PyCIL: A Python Toolbox for Class-Incremental Learning Introduction • Methods Reproduced • Reproduced Results • How To Use • License • Acknowledgement

Fu-Yun Wang 258 Dec 31, 2022
Official code base for the poster "On the use of Cortical Magnification and Saccades as Biological Proxies for Data Augmentation" published in NeurIPS 2021 Workshop (SVRHM)

Self-Supervised Learning (SimCLR) with Biological Plausible Image Augmentations Official code base for the poster "On the use of Cortical Magnificatio

Binxu 8 Aug 17, 2022
Using fully convolutional networks for semantic segmentation with caffe for the cityscapes dataset

Using fully convolutional networks for semantic segmentation (Shelhamer et al.) with caffe for the cityscapes dataset How to get started Download the

Simon Guist 27 Jun 06, 2022
Objective of the repository is to learn and build machine learning models using Pytorch. 30DaysofML Using Pytorch

30 Days Of Machine Learning Using Pytorch Objective of the repository is to learn and build machine learning models using Pytorch. List of Algorithms

Mayur 119 Nov 24, 2022
Code for "Searching for Efficient Multi-Stage Vision Transformers"

Searching for Efficient Multi-Stage Vision Transformers This repository contains the official Pytorch implementation of "Searching for Efficient Multi

Yi-Lun Liao 62 Oct 25, 2022
TensorFlow-based implementation of "Pyramid Scene Parsing Network".

PSPNet_tensorflow Important Code is fine for inference. However, the training code is just for reference and might be only used for fine-tuning. If yo

HsuanKung Yang 323 Dec 20, 2022
Learning Representational Invariances for Data-Efficient Action Recognition

Learning Representational Invariances for Data-Efficient Action Recognition Official PyTorch implementation for Learning Representational Invariances

Virginia Tech Vision and Learning Lab 27 Nov 22, 2022
Integrated physics-based and ligand-based modeling.

ComBind ComBind integrates data-driven modeling and physics-based docking for improved binding pose prediction and binding affinity prediction. Given

Dror Lab 44 Oct 26, 2022
An open-source Deep Learning Engine for Healthcare that aims to treat & prevent major diseases

AlphaCare Background AlphaCare is a work-in-progress, open-source Deep Learning Engine for Healthcare that aims to treat and prevent major diseases. T

Siraj Raval 44 Nov 05, 2022
Multi Task RL Baselines

MTRL Multi Task RL Algorithms Contents Introduction Setup Usage Documentation Contributing to MTRL Community Acknowledgements Introduction M

Facebook Research 171 Jan 09, 2023
Code for "Continuous-Time Meta-Learning with Forward Mode Differentiation" (ICLR 2022)

Continuous-Time Meta-Learning with Forward Mode Differentiation ICLR 2022 (Spotlight) - Installation - Example - Citation This repository contains the

Tristan Deleu 25 Oct 20, 2022
An Official Repo of CVPR '20 "MSeg: A Composite Dataset for Multi-Domain Segmentation"

This is the code for the paper: MSeg: A Composite Dataset for Multi-domain Semantic Segmentation (CVPR 2020, Official Repo) [CVPR PDF] [Journal PDF] J

226 Nov 05, 2022
PyTorch Personal Trainer: My framework for deep learning experiments

Alex's PyTorch Personal Trainer (ptpt) (name subject to change) This repository contains my personal lightweight framework for deep learning projects

Alex McKinney 8 Jul 14, 2022
Implementation for HFGI: High-Fidelity GAN Inversion for Image Attribute Editing

HFGI: High-Fidelity GAN Inversion for Image Attribute Editing High-Fidelity GAN Inversion for Image Attribute Editing Update: We released the inferenc

Tengfei Wang 371 Dec 30, 2022
PyTorch implementation of popular datasets and models in remote sensing

PyTorch Remote Sensing (torchrs) (WIP) PyTorch implementation of popular datasets and models in remote sensing tasks (Change Detection, Image Super Re

isaac 222 Dec 28, 2022
Automatic tool focused on deriving metallicities of open clusters

metalcode Automatic tool focused on deriving metallicities of open clusters. Based on the method described in Pöhnl & Paunzen (2010, https://ui.adsabs

2 Dec 13, 2021
PyTorch implementation of our paper: Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition

Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition, arxiv This is a PyTorch implementation of our paper. 1. Re

DamoCV 11 Nov 19, 2022
Implementation of Hierarchical Transformer Memory (HTM) for Pytorch

Hierarchical Transformer Memory (HTM) - Pytorch Implementation of Hierarchical Transformer Memory (HTM) for Pytorch. This Deepmind paper proposes a si

Phil Wang 63 Dec 29, 2022
Self-Supervised Image Denoising via Iterative Data Refinement

Self-Supervised Image Denoising via Iterative Data Refinement Yi Zhang1, Dasong Li1, Ka Lung Law2, Xiaogang Wang1, Hongwei Qin2, Hongsheng Li1 1CUHK-S

Zhang Yi 72 Jan 01, 2023