Source code release of the paper: Knowledge-Guided Deep Fractal Neural Networks for Human Pose Estimation.

Overview

GNet-pose

Project Page: http://guanghan.info/projects/guided-fractal/

UPDATE 9/27/2018:

Prototxts and model that achieved 93.9 PCK on the LSP dataset: http://guanghan.info/download/Data/GNet_update.zip

While replying to e-mails, I realized that the models I had uploaded dated from around May/June 2017 (the performance reported in the old arXiv version). In August 2017 the performance improved to 93.9 on LSP with a newer Caffe version that fixed the downsampling and/or upsampling deprecation problem (yes, it "magically" improved the performance). The best model reached 94.0071 on the LSP dataset, but it was neither uploaded nor published on the benchmark.


Overview


Source code release of the paper for reproduction of experimental results, and to aid researchers in future research.


Prerequisites


Getting Started

1. Download Data and Pre-trained Models

  • Datasets (MPII [1], LSP [2])

    bash ./get_dataset.sh
    
  • Models

    bash ./get_models.sh
    
  • Predictions (optional)

    bash ./get_preds.sh
    

2. Testing

  • Generate cropped patches from the dataset for testing:

    cd testing/
    matlab -nodisplay -nosplash -r "gen_cropped_LSP_test_images; exit"
    matlab -nodisplay -nosplash -r "gen_cropped_MPII_test_images; exit"
    cd -
    

    This will generate images with 368-by-368 resolution.

  • Reproduce the results with the pre-trained model:

    cd testing/
    python test.py
    cd -
    

    You can choose which dataset to test on and which model to use. The settings in test.py can also be changed, e.g., testing with or without flipping, scaling, cross-heatmap regression, etc.
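
    The flipping option is not described further here. As a rough illustration only, a common form of flip testing runs the network on the image and on its mirrored copy, mirrors the second set of heatmaps back, swaps the left/right joint channels, and averages the two results. The sketch below assumes that scheme; `hflip`, `swap_left_right`, `predict_with_flip`, `toy_predict`, and the `pairs` argument are hypothetical names that do not come from test.py, and plain nested lists stand in for real heatmap arrays.

```python
from typing import Callable, List, Sequence, Tuple

Grid = List[List[float]]  # H x W values: a grayscale image or one joint heatmap


def hflip(grid: Grid) -> Grid:
    """Mirror a 2-D grid left to right."""
    return [row[::-1] for row in grid]


def swap_left_right(heatmaps: Sequence[Grid],
                    pairs: Sequence[Tuple[int, int]]) -> List[Grid]:
    """Exchange the channels of each (left, right) joint pair."""
    out = list(heatmaps)
    for left, right in pairs:
        out[left], out[right] = out[right], out[left]
    return out


def predict_with_flip(image: Grid,
                      predict: Callable[[Grid], List[Grid]],
                      pairs: Sequence[Tuple[int, int]]) -> List[Grid]:
    """Average the heatmaps predicted for an image and for its mirrored copy."""
    direct = predict(image)
    mirrored = predict(hflip(image))
    # Undo the mirroring: flip every map back, then swap left/right channels.
    restored = swap_left_right([hflip(h) for h in mirrored], pairs)
    return [
        [[(a + b) / 2.0 for a, b in zip(row_d, row_r)]
         for row_d, row_r in zip(h_d, h_r)]
        for h_d, h_r in zip(direct, restored)
    ]


def toy_predict(image: Grid) -> List[Grid]:
    """Hypothetical stand-in for a network: channel 0 copies the image, channel 1 doubles it."""
    return [image, [[2.0 * v for v in row] for row in image]]


# Channels 0 and 1 are treated as one left/right pair in this toy example.
averaged = predict_with_flip([[1.0, 0.0]], toy_predict, pairs=[(0, 1)])
print(averaged)
```

    In practice `predict` would wrap the Caffe forward pass and `pairs` would list the left/right joint indices of the dataset in use; both depend on the actual code and are left as assumptions here.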

3. Training

  • Generate Annotations

    cd training/Annotations/
    matlab -nodisplay -nosplash -r "MPI; LEEDS; exit"
    cd -
    

    This will generate the annotations as JSON files.

  • Generate LMDB

    python ./training/Data/genLMDB.py
    

    This will load images from the dataset and annotations from the JSON files, and generate LMDB files for Caffe training.

  • Generate Prototxt files (optional)

    python ./training/GNet/scripts/gen_GNet.py
    python ./training/GNet/scripts/gen_fractal.py
    python ./training/GNet/scripts/gen_hourglass.py
    
  • Training:

     bash ./training/train.sh
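
    The training steps above have to run in order, so it can be convenient to drive them from one script. The following is a minimal sketch, not part of the repository: it assumes the MATLAB annotation step has already been run, that it is launched from the repository root, and the names `build_training_plan` and `run_plan` are hypothetical. The commands themselves are copied from the steps above.

```python
import subprocess
from typing import List

# Commands taken verbatim from the training steps; paths are relative to the repo root.
LMDB_STEP = ["python", "./training/Data/genLMDB.py"]
PROTOTXT_STEPS = [
    ["python", "./training/GNet/scripts/gen_GNet.py"],
    ["python", "./training/GNet/scripts/gen_fractal.py"],
    ["python", "./training/GNet/scripts/gen_hourglass.py"],
]
TRAIN_STEP = ["bash", "./training/train.sh"]


def build_training_plan(regenerate_prototxt: bool = False) -> List[List[str]]:
    """Return the commands in execution order; prototxt generation is optional."""
    plan = [LMDB_STEP]
    if regenerate_prototxt:
        plan.extend(PROTOTXT_STEPS)
    plan.append(TRAIN_STEP)
    return plan


def run_plan(plan: List[List[str]], dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in plan:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


# Dry run by default: only prints what would be executed.
run_plan(build_training_plan(regenerate_prototxt=True))
```

    Calling `run_plan(build_training_plan(), dry_run=False)` would execute the LMDB generation followed by training and skip the optional prototxt step.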
    

4. Performance Evaluation

    cd testing/eval_LSP/
    matlab -nodisplay -nosplash -r "test_evaluation_lsp; exit"
    cd -

    cd testing/eval_MPII/
    matlab -nodisplay -nosplash -r "test_evaluation_mpii_test; exit"
    cd -
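
For orientation, the PCK metric quoted for LSP can be sketched as follows. PCK counts a predicted joint as correct when it lies within a fraction of a per-person reference length of the ground-truth joint; LSP results are commonly reported with the torso size and a threshold of 0.2, and MPII results with the head size and a threshold of 0.5 (PCKh). This is only an illustration under those assumptions, not the MATLAB evaluation code in this repository; the function name `pck`, its arguments, and the default `alpha` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) joint location in pixels


def pck(pred: Sequence[Sequence[Point]],
        gt: Sequence[Sequence[Point]],
        ref_lengths: Sequence[float],
        alpha: float = 0.2) -> float:
    """Percentage of joints predicted within alpha * reference length of the ground truth.

    pred and gt hold one list of joints per person; ref_lengths holds one
    reference length per person (e.g. torso size or head size).
    """
    correct = 0
    total = 0
    for p_joints, g_joints, ref in zip(pred, gt, ref_lengths):
        for (px, py), (gx, gy) in zip(p_joints, g_joints):
            total += 1
            if math.hypot(px - gx, py - gy) <= alpha * ref:
                correct += 1
    return 100.0 * correct / total if total else 0.0


# One person, two joints, reference length 10 px, so the threshold is 2 px.
predictions = [[(0.0, 0.0), (10.0, 10.0)]]
ground_truth = [[(0.0, 1.0), (10.0, 20.0)]]
score = pck(predictions, ground_truth, ref_lengths=[10.0])
print(score)  # 50.0: the first joint is 1 px off, the second 10 px off
```

The sketch ignores details such as invisible joints and per-joint breakdowns, which the evaluation scripts may treat differently.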

5. Results

More qualitative results can be found on the project page. For quantitative results, please refer to the arXiv paper.


License

GNet-pose is released under the Apache License Version 2.0 (refer to the LICENSE file for details).


Citation

If you use the code and models, please cite the following paper (TMM 2017):

@article{ning2017knowledge,
  author={G. Ning and Z. Zhang and Z. He},
  journal={IEEE Transactions on Multimedia},
  title={Knowledge-Guided Deep Fractal Neural Networks for Human Pose Estimation},
  year={2017},
  doi={10.1109/TMM.2017.2762010},
  ISSN={1520-9210},
}

References

[1] Mykhaylo Andriluka, Leonid Pishchulin, Peter Gehler, and Bernt Schiele. "2D Human Pose Estimation: New Benchmark and State of the Art Analysis." CVPR (2014).

[2] Sam Johnson and Mark Everingham. "Clustered Pose and Nonlinear Appearance Models for Human Pose Estimation." BMVC (2010).

Owner
Guanghan Ning