GAN-based 3D human pose estimation model for 3DV'17 paper

Overview

Tensorflow implementation for 3DV 2017 conference paper "Adversarially Parameterized Optimization for 3D Human Pose Estimation".

@inproceedings{jack2017adversarially,
  title={Adversarially Parameterized Optimization for 3D Human Pose Estimation},
  author={Jack, Dominic and Maire, Frederic and Eriksson, Anders and Shirazi, Sareh},
  booktitle={3D Vision (3DV), 2017 Fifth International Conference on},
  year={2017},
  organization={IEEE}
}

Code used to generate results for the paper has been frozen and can be found in the 3dv2017 branch. Bug fixes and extensions will be applied to other branches.

Algorithm Overview

The premise of the paper is to train a GAN to simultaneously learn a parameterization of the feasible human pose space along with a feasibility loss function.

During inference, a standard off-the-shelf optimizer infers all poses from sequence almost-independently (the scale is shared between frames, which has no effect on the results (since errors are on the procruste-aligned inferences which optimize over scale) but makes the visualizations easier to interpret).

Repository Structure

Each GAN is identified by a gan_id. Hyperparameters defining the network structures and datasets from which they should be trained are specified in gan_params/gan_id.json. A couple (those with results highlighted in the paper) are provided, h3m_big, h3m_small and eva_big. Note that compared to typical neural networks, these are still tiny, so the difference in size should result in a negligible difference in training/inference time.

Similarly, each inference run is identified by an inference_id, the parameters of which are defined in inference_params/inference_id.json. including geometric transforms, visualizations and dataset reading

  • gan: provides application-specific GANs based on specifications in gan_params
  • serialization.py: i/o related functions for loading hyper-parameters/results

Scripts:

  • train.py: Trains a GAN specified by a json file in gan_params
  • gan_generator_vis.py: visualization script for a trained GAN generator
  • interactive_gan_generator_vis.ipynb: interactive jupyter/ipython notebook for visualizing a trained GAN generator
  • generate_inferences.py: Generates inferences based on parameters specified by a json file in inference_params
  • h3m_report.py/eva_report.py: reporting scripts for generated inferences.
  • vis_sequecne.py: visualization script for entire inferred sequence.

Usage

  1. Setup the external repositories: * human_pose_util
  2. Clone this repository and add the location and the parent directory(s) to your PYTHONPATH
cd path/to/parent_folder
git clone https://github.com/jackd/adversarially_parameterized_optimization.git
git clone https://github.com/jackd/human_pose_util.git
export PYTHONPATH=/path/to/parent_folder:$PYTHONPATH
cd adversarially_parameterized_optimization
  1. Define a GAN model by creating a gan_params/gan_id.json file, or select one of the existing ones.
  2. Setup the relevant dataset(s) or create your own as described in human_pose_util.
  3. Train the GAN
python train.py gan_id --max_steps=1e7

Our experiments were conducted on an NVidia K620 Quadro GPU with 2GB memory. Training runs at ~600 batches per second with a batch size of 128. For 10 million steps (likely excessive) this takes around 4.5 hours.

View training progress and compare different runs using tensorboard:

tensorboard --logdir=models
  1. (Optional) Check your generator is behaving well by running gan_generator_vis.py model_id or interactively by running interactive_gan_generator_vis.ipynb and modifying the model_id.
  2. Define an inference specification by creating an inference_params/inference_id.json file, or select one of the defaults provided.
  3. Generate inference
python generate_inferences.py inference_id

Sequence optimization runs at ~5-10fps (speed-up compared to 1fps reported in paper due to reimplementation efficiencies rather than different ideas).

This will save results in results.hdf5 in the inference_id group. 9. See the results! * h3m_report.py or eva_report.py depending on the dataset gives qualitative results

python report.py eval_id
* `vis_sequence.py` visualizes inferences

Note that results are quite unstable with respect to GAN training. You may get considerably different quantitative results than those published in the paper, though qualitative behaviour should be similar.

Serialization

To aid with experiments with different parameter sets, model/inference parameters are saved in json for ease of parsing and human readability. To allow for extensibility, human_pose_util maintains registers for different datasets and skeletons.

See the README for details on setting up/preprocessing of datasets or implementing your own.

The scripts in this project register some default h3m/eva datasets using register_defaults. While normally fast, some data conversion is performed the first time this function is run for each dataset and requires the original datasets be available with paths defined (see below). If you only wish to experiment with one dataset -- e.g. h3m -- modify the default argument values for register_defaults, e.g. def register_defaults(h3m=True, eva=False): (or the relevant function calls).

If you implement your own datasets/skeletons, either add their registrations to the default functions, or edit the relevant scripts to register them manually.

Datasets

See human_pose_util repository for instructions for setting up datasets.

Requirements

For training/inference:

  • tensorflow 1.4
  • numpy
  • h5py For visualizations:
  • matplotlib
  • glumpy (install from source may reduce issues) For initial human 3.6m dataset transformations:
  • spacepy (for initial human 3.6m dataset conversion to hdf5)

Development

This branch will be actively maintained, updated and extended. For code used to generate results for the publication, see the 3dv2017 branch.

Contact

Please report any issues/bugs. Feature requests in this repository will largely be ignored, but will be considered if made in independent repositories.

Email contact to discuss ideas/collaborations welcome: [email protected].

Owner
Dominic Jack
Deep Learning / Cybsecurity Researcher
Dominic Jack
A repository built on the Flow software package to explore cyber-security attacks on intelligent transportation systems.

A repository built on the Flow software package to explore cyber-security attacks on intelligent transportation systems.

George Gunter 4 Nov 14, 2022
Data and extra materials for the food safety publications classifier

Data and extra materials for the food safety publications classifier The subdirectories contain detailed descriptions of their contents in the README.

1 Jan 20, 2022
[CVPR2021 Oral] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers This is the official PyTorch implementation and models for UP-DETR paper: @a

dddzg 430 Dec 23, 2022
Binary Passage Retriever (BPR) - an efficient passage retriever for open-domain question answering

BPR Binary Passage Retriever (BPR) is an efficient neural retrieval model for open-domain question answering. BPR integrates a learning-to-hash techni

Studio Ousia 147 Dec 07, 2022
ReGAN: Sequence GAN using RE[INFORCE|LAX|BAR] based PG estimators

Sequence Generation with GANs trained by Gradient Estimation Requirements: PyTorch v0.3 Python 3.6 CUDA 9.1 (For GPU) Origin The idea is from paper Se

40 Nov 03, 2022
This repository contains the segmentation user interface from the OpenSurfaces project, extracted as a lightweight tool

OpenSurfaces Segmentation UI This repository contains the segmentation user interface from the OpenSurfaces project, extracted as a lightweight tool.

Sean Bell 66 Jul 11, 2022
The MLOps platform for innovators 🚀

​ DS2.ai is an integrated AI operation solution that supports all stages from custom AI development to deployment. It is an AI-specialized platform service that collects data, builds a training datas

9 Jan 03, 2023
The sixth place winning solution (6/220) in 2021 Gaofen Challenge.

SwinTransformer + OBBDet The sixth place winning solution (6/220) in the track of Fine-grained Object Recognition in High-Resolution Optical Images, 2

ming71 46 Dec 02, 2022
PyTorch Kafka Dataset: A definition of a dataset to get training data from Kafka.

PyTorch Kafka Dataset: A definition of a dataset to get training data from Kafka.

ERTIS Research Group 7 Aug 01, 2022
Time should be taken seer-iously

TimeSeers seers - (Noun) plural form of seer - A person who foretells future events by or as if by supernatural means TimeSeers is an hierarchical Bay

279 Dec 26, 2022
Implementation of ConvMixer-Patches Are All You Need? in TensorFlow and Keras

Patches Are All You Need? - ConvMixer ConvMixer, an extremely simple model that is similar in spirit to the ViT and the even-more-basic MLP-Mixer in t

Sayan Nath 8 Oct 03, 2022
Transformers are Graph Neural Networks!

🚀 Gated Graph Transformers Gated Graph Transformers for graph-level property prediction, i.e. graph classification and regression. Associated article

Chaitanya Joshi 46 Jun 30, 2022
ServiceX Transformer that converts flat ROOT ntuples into columnwise data

ServiceX_Uproot_Transformer ServiceX Transformer that converts flat ROOT ntuples into columnwise data Usage You can invoke the transformer from the co

Vis 0 Jan 20, 2022
PyTorch implementation of PP-LCNet

PP-LCNet-Pytorch Pre-Trained Models Google Drive p018 Accuracy Models Top1 Top5 PPLCNet_x0_25 0.5186 0.7565 PPLCNet_x0_35 0.5809 0.8083 PPLCNet_x0_5 0

24 Dec 12, 2022
Implementation for Learning to Track with Object Permanence

Learning to Track with Object Permanence A video-based MOT approach capable of tracking through full occlusions: Learning to Track with Object Permane

Toyota Research Institute - Machine Learning 91 Jan 03, 2023
Dense matching library based on PyTorch

Dense Matching A general dense matching library based on PyTorch. For any questions, issues or recommendations, please contact Prune at

Prune Truong 399 Dec 28, 2022
Implementation of the paper Recurrent Glimpse-based Decoder for Detection with Transformer.

REGO-Deformable DETR By Zhe Chen, Jing Zhang, and Dacheng Tao. This repository is the implementation of the paper Recurrent Glimpse-based Decoder for

Zhe Chen 33 Nov 30, 2022
🗺 General purpose U-Network implemented in Keras for image segmentation

TF-Unet General purpose U-Network implemented in Keras for image segmentation Getting started • Training • Evaluation Getting started Looking for Jupy

Or Fleisher 2 Aug 31, 2022
Unimodal Face Classification with Multimodal Training

Unimodal Face Classification with Multimodal Training This is a PyTorch implementation of the following paper: Unimodal Face Classification with Multi

Wenbin Teng 3 Jul 06, 2022
Graph-total-spanning-trees - A Python script to get total number of Spanning Trees in a Graph

Total number of Spanning Trees in a Graph This is a python script just written f

Mehdi I. 0 Jul 18, 2022