Code for ICCV 2021 paper "HuMoR: 3D Human Motion Model for Robust Pose Estimation"


HuMoR: 3D Human Motion Model for Robust Pose Estimation (ICCV 2021)

This is the official implementation for the ICCV 2021 paper. For more information, see the project webpage.

Environment Setup

Note: This code was developed on Ubuntu 16.04/18.04 with Python 3.7, CUDA 10.1 and PyTorch 1.6.0. Later versions should work, but have not been tested.

Create and activate a virtual environment to work in, e.g. using Conda:

conda create -n humor_env python=3.7
conda activate humor_env

Install CUDA and PyTorch 1.6. For CUDA 10.1, this would look like:

conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch

Install the remaining requirements with pip:

pip install -r requirements.txt

You must also have ffmpeg installed on your system to save visualizations.
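
Before moving on, a quick sanity check of the environment can save time. The following is a minimal sketch and not part of this repo: `check_environment` is a hypothetical helper, and it only reports what it finds rather than enforcing the versions listed above.

```python
import shutil
import sys


def check_environment():
    """Report Python, PyTorch, CUDA, and ffmpeg status (hypothetical helper)."""
    report = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        # ffmpeg must be on PATH to save visualizations
        "ffmpeg_found": shutil.which("ffmpeg") is not None,
        "torch": None,
        "cuda_available": False,
    }
    try:
        import torch  # imported lazily so the check runs even without PyTorch
        report["torch"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
    except ImportError:
        pass
    return report
```

Calling `check_environment()` inside the activated `humor_env` environment returns a small dictionary that can be printed and compared against the versions in the note above.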

Downloads & External Dependencies

Some modes of operation in this codebase depend on external downloads. Here we briefly describe each one and what it is used for. Detailed setup instructions are linked in other READMEs.

Body Model and Pose Prior

Detailed instructions to install SMPL+H and VPoser are in this documentation.

  • SMPL+H is used for the pose/shape body model. Downloading this model is necessary for all uses of this codebase.
  • VPoser is used as a pose prior only during the initialization phase of fitting, so it's only needed if you are using the test-time optimization functionality of this codebase.

Datasets

Detailed instructions to install, configure, and process each dataset are in this documentation.

  • AMASS motion capture data is used to train and evaluate (e.g. randomly sample) the HuMoR motion model and for fitting to 3D data like noisy joints and partial keypoints.
  • i3DB contains RGB videos with heavy occlusions and is only used in the paper to evaluate test-time fitting to 2D joints.
  • PROX contains RGB-D videos and is only used in the paper to evaluate test-time fitting to 2D joints and 3D point clouds.

Pretrained Models

Pretrained model checkpoints are available for HuMoR, HuMoR-Qual, and the initial state Gaussian mixture. To download (~215 MB), from the repo root run bash get_ckpt.sh.

OpenPose

OpenPose is used to detect 2D joints for fitting to arbitrary RGB videos. If you will be running test-time optimization on the demo video or your own videos, you must install OpenPose. To clone and build, please follow the OpenPose README in their repo.

Optimization in run_fitting.py assumes OpenPose is installed at ./external/openpose by default - if you install elsewhere, please pass in the location using the --openpose flag.
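
If you launch the fitting from a script, the OpenPose location can be handled in one place. The sketch below is illustrative only: `fitting_command` is a hypothetical helper, and it assumes that extra flags may be appended after the `@config` argument on the command line.

```python
from pathlib import Path

# Default OpenPose location assumed by run_fitting.py, as stated above.
DEFAULT_OPENPOSE = Path("./external/openpose")


def fitting_command(config, openpose_dir=None):
    """Build the run_fitting.py argument list (hypothetical helper).

    --openpose is added only when a non-default location is given.
    """
    cmd = ["python", "humor/fitting/run_fitting.py", f"@{config}"]
    if openpose_dir is not None and Path(openpose_dir) != DEFAULT_OPENPOSE:
        cmd += ["--openpose", str(openpose_dir)]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repo root.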

Fitting to RGB Videos (Test-Time Optimization)

To run motion/shape estimation on an arbitrary RGB video, you must have SMPL+H, VPoser, OpenPose, and a pretrained HuMoR model as detailed above. We have included a demo video in this repo along with a few example configurations to get started.

Note: if running on your own video, make sure the camera is not moving and the person is not interacting with uneven terrain in the scene (we assume a single ground plane). Also, only one person will be reconstructed.

To run the optimization on the demo video use:

python humor/fitting/run_fitting.py @./configs/fit_rgb_demo_no_split.cfg

This configuration optimizes over the entire video (~3 sec) at once (i.e. over all frames). If your video is longer than 2-3 sec, we recommend using the settings in ./configs/fit_rgb_demo_use_split.cfg instead, which add the --rgb-seq-len, --rgb-overlap-len, and --rgb-overlap-consist-weight arguments. With this configuration, the input video is split into multiple overlapping sub-sequences that are optimized as a batch, with consistency losses between sub-sequences. This improves efficiency and reduces the need to tune parameters based on video length. Note that the larger the batch size, the better the results will be.
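
To build intuition for the overlapping split, here is a minimal sketch of how a video could be divided into windows. It is not the code used in run_fitting.py: `split_with_overlap` is a hypothetical function, it assumes the sequence and overlap lengths are measured in frames, and it simply lets the last window be shorter, which may differ from how the repo handles the tail of a video.

```python
def split_with_overlap(num_frames, seq_len, overlap_len):
    """Return (start, end) frame ranges, end exclusive, that cover the video.

    Consecutive windows share overlap_len frames (illustrative sketch only).
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if not 0 <= overlap_len < seq_len:
        raise ValueError("overlap_len must be in [0, seq_len)")
    stride = seq_len - overlap_len
    windows = []
    start = 0
    while True:
        end = min(start + seq_len, num_frames)
        windows.append((start, end))
        if end == num_frames:
            break
        start += stride
    return windows


print(split_with_overlap(150, 60, 10))  # [(0, 60), (50, 110), (100, 150)]
```

In this example, frames 50-59 and 100-109 appear in two windows each. Shared frames like these are where a consistency loss between sub-sequences would apply.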

If known, it's highly recommended to pass in camera intrinsics using the --rgb-intrinsics flag. See ./configs/intrinsics_default.json for an example of what this looks like. If intrinsics are not given, default focal lengths are used.
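
As a reminder of what the intrinsics control, the sketch below builds a standard pinhole intrinsics matrix and projects a 3D point in camera coordinates to pixels. The function names and the numeric values are made up for illustration, and the sketch makes no assumption about the JSON layout expected by --rgb-intrinsics; take that from the example file.

```python
def intrinsics_matrix(fx, fy, cx, cy):
    """3x3 pinhole intrinsics matrix K as nested lists."""
    return [[fx, 0.0, cx],
            [0.0, fy, cy],
            [0.0, 0.0, 1.0]]


def project(K, point):
    """Project a 3D point in camera coordinates (z > 0) to pixel coordinates."""
    x, y, z = point
    u = K[0][0] * x / z + K[0][2]
    v = K[1][1] * y / z + K[1][2]
    return u, v


# Illustrative values only: 1000 px focal length, 1920x1080 image center.
K = intrinsics_matrix(1000.0, 1000.0, 960.0, 540.0)
print(project(K, (0.5, -0.25, 2.0)))  # (1210.0, 415.0)
```

Because the focal length scales every projected coordinate, fitting 2D joints with a poor focal-length guess trades off against the estimated depth of the person, which is one reason to supply the true intrinsics when they are known.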

Finally, this demo does not use PlaneRCNN to initialize the ground as described in the paper. Instead, it roughly initializes the ground at y = 0.5 (with camera up-axis -y). We found this to be sufficient and often better than using PlaneRCNN. If you want to use PlaneRCNN instead, set up a separate environment, follow their install instructions, then run their method with the following command, where example_image_dir contains a single frame from your video and the camera parameters:

python evaluate.py --methods=f --suffix=warping_refine --dataset=inference --customDataFolder=example_image_dir

The results directory can then be passed into our optimization using the --rgb-planercnn-res flag.

Visualizing RGB Results

The optimization is performed in 3 stages, with stages 1 & 2 being initialization using a pose prior and smoothing (i.e. the VPoser-t baseline) and stage 3 being the full optimization with the HuMoR motion prior. So for the demo, the final output for the full sequence will be saved in ./out/rgb_demo_no_split/results_out/final_results/stage3_results.npz. To visualize results from the fitting use something like:

python humor/fitting/viz_fitting_rgb.py  --results ./out/rgb_demo_no_split/results_out --out ./out/rgb_demo_no_split/viz_out --viz-prior-frame

By default, this will visualize the final full video result along with each sub-sequence separately (if applicable). Please use --help to see the many additional visualization options. This code is also useful to see how to load in and use the results for other tasks, if desired.
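
For a quick look at a results file without running the visualization, the arrays in the .npz can be listed with NumPy. This is a minimal sketch: `summarize_results` is a hypothetical helper, it makes no assumption about which array names the file contains, and it assumes the arrays load with NumPy's default settings (no pickled objects).

```python
import numpy as np


def summarize_results(npz_path):
    """Return {array_name: shape} for every array stored in an .npz file."""
    with np.load(npz_path) as data:
        return {name: data[name].shape for name in data.files}
```

For the demo, this could be called on ./out/rgb_demo_no_split/results_out/final_results/stage3_results.npz. The visualization script remains the reference for how each array is meant to be used.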

Fitting on Specific Datasets

Next, we detail how to run and evaluate the test-time optimization on the various datasets presented in the paper. In all these examples, the default batch size is quite small to accommodate smaller GPUs, but it should be increased depending on your system.

AMASS 3D Data

There are multiple settings for fitting to 3D data (e.g. noisy joints, partial keypoints, etc.), which can be specified using configuration flags. For example, to fit to partial upper-body 3D keypoints sampled from AMASS data, run:

python humor/fitting/run_fitting.py @./configs/fit_amass_keypts.cfg

Optimization results can be visualized using

python humor/fitting/eval_fitting_3d.py --results ./out/amass_verts_upper_fitting/results_out --out ./out/amass_verts_upper_fitting/eval_out  --qual --viz-stages --viz-observation

and evaluation metrics computed with

python humor/fitting/eval_fitting_3d.py --results ./out/amass_verts_upper_fitting/results_out --out ./out/amass_verts_upper_fitting/eval_out  --quant --quant-stages

The most relevant quantitative results will be written to eval_out/eval_quant/compare_mean.csv.
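
The CSV can be inspected with any spreadsheet tool or read directly in Python. The sketch below uses only the standard library; `read_metrics` is a hypothetical helper and it assumes nothing about the column names, returning each row as a list of strings.

```python
import csv


def read_metrics(csv_path):
    """Read a metrics CSV into a list of rows, skipping empty lines."""
    with open(csv_path, newline="") as f:
        return [row for row in csv.reader(f) if row]
```

For the AMASS example above, the path would be ./out/amass_verts_upper_fitting/eval_out/eval_quant/compare_mean.csv.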

i3DB RGB Data

The i3DB dataset contains RGB videos with many occlusions along with annotated 3D joints for evaluation. To run test-time optimization on the full dataset, use:

python humor/fitting/run_fitting.py @./configs/fit_imapper.cfg

Results can be visualized using the same script as in the demo:

python humor/fitting/viz_fitting_rgb.py  --results ./out/imapper_fitting/results_out --out ./out/imapper_fitting/viz_out --viz-prior-frame

Quantitative evaluation (comparing to results after each optimization stage) can be run with:

python humor/fitting/eval_fitting_2d.py --results ./out/imapper_fitting/results_out --dataset iMapper --imapper-floors ./data/iMapper/i3DB/floors --out ./out/imapper_fitting/eval_out --quant --quant-stages

The final quantitative results will be written to eval_out/eval_quant/compare_mean.csv.

PROX RGB/RGB-D Data

PROX contains RGB-D data, so it supports fitting to 2D joints alone or to 2D joints plus a 3D point cloud. The commands for each are nearly identical and differ only in the configuration file. To run on the full RGB-D data, use:

python humor/fitting/run_fitting.py @./configs/fit_proxd.cfg

Visualization must add the --flip-img flag to align with the original PROX videos:

python humor/fitting/viz_fitting_rgb.py  --results ./out/proxd_fitting/results_out --out ./out/proxd_fitting/viz_out --viz-prior-frame --flip-img

Quantitative evaluation (of plausibility metrics) for full RGB-D data uses

python humor/fitting/eval_fitting_2d.py --results ./out/proxd_fitting/results_out --dataset PROXD --prox-floors ./data/prox/qualitative/floors --out ./out/proxd_fitting/eval_out --quant --quant-stages

and for just RGB data is slightly different:

python humor/fitting/eval_fitting_2d.py --results ./out/prox_fitting/results_out --dataset PROX --prox-floors ./data/prox/qualitative/floors --out ./out/prox_fitting/eval_out --quant --quant-stages

Training & Testing Motion Model

There are two versions of our model: HuMoR and HuMoR-Qual. HuMoR is the main model presented in the paper and is best suited for test-time optimization. HuMoR-Qual is a slight variation on HuMoR that gives more stable and qualitatively superior results for random motion generation (see the paper for details).

Below we describe how to train and test HuMoR, but the exact same commands are used for HuMoR-Qual with a different configuration file at each step (see all provided configs).

Training HuMoR

To train HuMoR from scratch, make sure you have the processed version of the AMASS dataset at ./data/amass_processed and run:

python humor/train/train_humor.py @./configs/train_humor.cfg

The default batch size is meant for a 16 GB GPU.

Testing HuMoR

After training HuMoR or downloading the pretrained checkpoints, we can evaluate the model in multiple ways.

To compute single-step losses (the exact same as during training) over the entire test set run:

python humor/test/test_humor.py @./configs/test_humor.cfg

To randomly sample a motion sequence and save a video visualization, run:

python humor/test/test_humor.py @./configs/test_humor_sampling.cfg

If you'd rather visualize the sampling results in an interactive viewer, use:

python humor/test/test_humor.py @./configs/test_humor_sampling_debug.cfg

Try adding --viz-pred-joints, --viz-smpl-joints, or --viz-contacts to the end of the command to visualize more outputs, or increasing the value of --eval-num-samples to sample the model multiple times from the same initial state. --help can always be used to see all flags and their descriptions.
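
When trying several flag combinations, it can be convenient to assemble the command in a script. This is a minimal sketch: `sampling_command` is a hypothetical helper that only knows about the flags named above and appends them to the end of the command.

```python
# Visualization flags mentioned in this README; other flags are listed by --help.
KNOWN_VIZ_FLAGS = {"--viz-pred-joints", "--viz-smpl-joints", "--viz-contacts"}


def sampling_command(config="./configs/test_humor_sampling.cfg",
                     num_samples=None, extra_flags=()):
    """Build the test_humor.py argument list (hypothetical helper)."""
    unknown = set(extra_flags) - KNOWN_VIZ_FLAGS
    if unknown:
        raise ValueError(f"unrecognized flags: {sorted(unknown)}")
    cmd = ["python", "humor/test/test_humor.py", f"@{config}"]
    cmd += list(extra_flags)
    if num_samples is not None:
        # Sample the model several times from the same initial state.
        cmd += ["--eval-num-samples", str(num_samples)]
    return cmd
```

For example, `sampling_command(num_samples=3, extra_flags=["--viz-contacts"])` produces a list that can be passed to `subprocess.run(cmd, check=True)` from the repo root.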

Training Initial State GMM

Test-time optimization also uses a Gaussian mixture model (GMM) prior over the initial state of the sequence. The pretrained model can be downloaded above, but if you wish to train from scratch, run:

python humor/train/train_state_prior.py --data ./data/amass_processed --out ./out/init_state_prior_gmm --gmm-comps 12

Citation

If you found this code or paper useful, please consider citing:

@inproceedings{rempe2021humor,
    author={Rempe, Davis and Birdal, Tolga and Hertzmann, Aaron and Yang, Jimei and Sridhar, Srinath and Guibas, Leonidas J.},
    title={HuMoR: 3D Human Motion Model for Robust Pose Estimation},
    booktitle={International Conference on Computer Vision (ICCV)},
    year={2021}
}

Questions?

If you run into any problems or have questions, please create an issue or contact Davis (first author) via email.
