4D Human Body Capture from Egocentric Video via 3D Scene Grounding

Last update: Nov 08, 2022

Installation:

Our method requires the same dependencies as SMPLify-X and OpenPose. Please refer to the official implementations of SMPLify-X and OpenPose for installation details.

Our method also requires Chamfer Pytorch, which computes the chamfer distance used to enforce human-scene constraints.
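For readers unfamiliar with the chamfer distance, the following is a minimal brute-force sketch in plain Python. It is not the Chamfer Pytorch implementation: the function name `chamfer_distance` is hypothetical, and the choice of squared distances averaged per point is an assumption, since conventions differ between libraries.

```python
def chamfer_distance(points_a, points_b):
    """Symmetric chamfer distance between two 3D point sets (brute force).

    Assumption: squared Euclidean distances, averaged over each set.
    """
    def one_sided(src, dst):
        # Mean squared distance from every source point to its nearest target point.
        total = 0.0
        for p in src:
            total += min(sum((pi - qi) ** 2 for pi, qi in zip(p, q)) for q in dst)
        return total / len(src)

    return one_sided(points_a, points_b) + one_sided(points_b, points_a)


# Toy example: two "body" points hovering one unit above two "scene" points.
body = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
scene = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(chamfer_distance(body, scene))
```

A GPU implementation is needed in practice because body meshes and scene point clouds contain many thousands of points, which makes this quadratic loop far too slow.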

Data Preparation:

Step 1: Dump video frames at the desired frame rate (30 fps) with utils/dump_videos.py. Run utils/split_frames to segment the videos into equally long subatom clips. Repack the frames into videos with utils/pack_videos.py (this speeds up OpenPose I/O).
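The segmentation into equally long clips can be pictured with the sketch below. It is only an illustration and not the code in utils/split_frames: the function name, the frame naming pattern, the clip length of 30 frames, and the decision to drop leftover frames are all assumptions.

```python
def split_into_clips(frame_names, clip_len):
    """Group an ordered frame list into consecutive clips of exactly clip_len frames."""
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    n_full = len(frame_names) // clip_len
    # Assumption: trailing frames that do not fill a whole clip are dropped.
    return [frame_names[i * clip_len:(i + 1) * clip_len] for i in range(n_full)]


# Hypothetical frame names; 100 frames at 30 frames per clip.
frames = [f"frame_{i:05d}.jpg" for i in range(100)]
clips = split_into_clips(frames, clip_len=30)
print(len(clips), clips[1][0])
```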

Step 2: Run openpose_call.py under the openpose folder to get human body keypoints, then run utils/openpose_helper to rename keypoint.json, and run utils/openpose_filter.py to keep the most confident human keypoints.
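As a rough idea of what keeping the most confident person can look like, the sketch below works on the standard OpenPose JSON layout, where each entry of `people` stores `pose_keypoints_2d` as flat (x, y, confidence) triples. The selection rule used here (highest mean confidence) and the function names are assumptions; utils/openpose_filter.py may use a different criterion.

```python
import json


def most_confident_person(openpose_json):
    """Return the detection with the highest mean body-keypoint confidence, or None."""
    people = openpose_json.get("people", [])

    def mean_conf(person):
        keypoints = person.get("pose_keypoints_2d", [])
        conf = keypoints[2::3]  # every third value is a confidence score
        return sum(conf) / len(conf) if conf else 0.0

    return max(people, key=mean_conf) if people else None


def filter_frame(openpose_json):
    """Copy the frame dictionary, keeping only the most confident person."""
    best = most_confident_person(openpose_json)
    out = dict(openpose_json)
    out["people"] = [best] if best is not None else []
    return out


# Tiny made-up frame with two detections of two keypoints each.
sample = json.loads(
    '{"version": 1.3, "people": ['
    '{"pose_keypoints_2d": [10, 20, 0.2, 30, 40, 0.4]},'
    '{"pose_keypoints_2d": [50, 60, 0.9, 70, 80, 0.7]}]}'
)
filtered = filter_frame(sample)
print(filtered["people"])
```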

Step 3: Run the SMPLify-X model with the specified focal length and data directory. This step may take several hours. For instance:

python3 smplifyx/main.py --config cfg_files/fit_smplx.yaml  --data_folder /home/miao/data/rylm/downsampled_frames/miao_mainbuilding_0-1 --output_folder /home/miao/data/rylm/downsampled_frames/miao_mainbuilding_0-1/body_gen --visualize="False" --model_folder ./models --vposer_ckpt ./vposer --part_segm_fn smplx_parts_segm.pkl --focal_length 694.0

Step 4: Run COLMAP to generate the scene mesh and camera trajectory. This step may take several hours, depending on the complexity of the scene. Then run utils/camerpose_helper and utils/pointscloud_helper.py to generate the point cloud file and camera poses.
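To make the camera trajectory more concrete, here is a minimal sketch that reads one pose line in COLMAP's text export format (`IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME`, a world-to-camera transform) and recovers the camera center in world coordinates as C = -R^T t. Whether the helper scripts above store poses this way in camera_pose.txt is an assumption; the function names and the sample line are hypothetical.

```python
def quat_to_rotmat(qw, qx, qy, qz):
    """Rotation matrix from a unit quaternion given in (w, x, y, z) order."""
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qw * qz), 2 * (qx * qz + qw * qy)],
        [2 * (qx * qy + qw * qz), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qw * qx)],
        [2 * (qx * qz - qw * qy), 2 * (qy * qz + qw * qx), 1 - 2 * (qx * qx + qy * qy)],
    ]


def camera_center(pose_line):
    """Camera center in world coordinates from one COLMAP-style pose line."""
    fields = pose_line.split()
    qw, qx, qy, qz, tx, ty, tz = (float(v) for v in fields[1:8])
    R = quat_to_rotmat(qw, qx, qy, qz)
    t = (tx, ty, tz)
    # x_cam = R @ x_world + t, so the center (x_cam = 0) is C = -R^T @ t.
    return tuple(-sum(R[r][c] * t[r] for r in range(3)) for c in range(3))


# Made-up line: identity rotation, translation (1, 2, 3).
line = "1 1.0 0.0 0.0 0.0 1.0 2.0 3.0 1 frame_00000.jpg"
print(camera_center(line))
```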

Joint Optimization with 3D Scene Context:

Run global_optimization.py to perform temporal smoothing and enforce human-scene constraints:

python3 global_optimization.py '/home/miao/data/rylm/packed_data/miao_mainbuidling_0-1/body_gen' '/home/miao/data/rylm/packed_data/miao_mainbuidling_0-1/smoothed_body'
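The idea of temporal smoothing can be illustrated with a centered moving average over per-frame parameter vectors, as in the sketch below. This is a deliberately simple stand-in and not what global_optimization.py does, which presumably solves a joint optimization; the function name and the window size are assumptions.

```python
def smooth_sequence(seq, window=5):
    """Centered moving average over a list of equal-length parameter vectors."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(seq)):
        # Clamp the window at the sequence boundaries.
        lo, hi = max(0, i - half), min(len(seq), i + half + 1)
        neighbours = seq[lo:hi]
        smoothed.append([sum(col) / len(neighbours) for col in zip(*neighbours)])
    return smoothed


# A one-dimensional track with a single-frame spike that gets spread out.
track = [[0.0], [0.0], [3.0], [0.0], [0.0]]
print(smooth_sequence(track, window=3))
```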

The resulting data should be organized as follows:

datafolder:
- videoname:
  - images: folder that contains all video frames
  - keypoints: folder that contains all body keypoints
  - body_gen: folder that contains all body mesh files
  - smoothed_body: folder that contains all jointly optimized body mesh files
  - camera_pose.txt: text file that contains camera pose at each temporal footprint
  - meshed-poisson.ply: scene mesh file from dense reconstruction
  - camera.txt: text file that contains camera parameters
  - xyz.ply: point cloud file (use MeshLab to convert the .xyz file to a .ply file)
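Before running the optimization or visualization scripts, it can help to check that a clip folder matches this layout. The sketch below is a hypothetical helper and not part of the repository; the entry names are taken from the listing above, with the jointly optimized folder spelled smoothed_body as in the global_optimization.py command.

```python
from pathlib import Path

EXPECTED_DIRS = ["images", "keypoints", "body_gen", "smoothed_body"]
EXPECTED_FILES = ["camera_pose.txt", "meshed-poisson.ply", "camera.txt", "xyz.ply"]


def missing_entries(video_dir):
    """Return the names of expected folders and files absent from video_dir."""
    root = Path(video_dir)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


# Placeholder path; an empty list means the layout is complete.
print(missing_entries("datafolder/videoname"))
```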

Visualization in the World Coordinate:

Run global_vis.py to transform the body mesh from the pivot coordinate to the world coordinate. By default, the open3d viewpoint is the initial position of the camera trajectory. Setting the bool flag to 'True' makes the open3d viewpoint follow the camera's motion.

python3 global_vis.py '/home/miao/data/rylm/downsampled_frames/miao_mainbuilding_0-1/' False
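The coordinate change behind this step can be sketched as a rigid transform applied to every mesh vertex. The example assumes the body is expressed in a camera frame related to the world by x_cam = R x_world + t, so that x_world = R^T (x_cam - t); how global_vis.py defines its pivot coordinate is not stated here, and the function name is hypothetical.

```python
def camera_to_world(vertices_cam, R, t):
    """Map vertices from a camera frame to the world frame.

    Assumption: R (3x3 nested list) and t (length 3) describe the
    world-to-camera pose, so each vertex is mapped by R^T (x_cam - t).
    """
    world = []
    for v in vertices_cam:
        d = [v[i] - t[i] for i in range(3)]
        world.append(tuple(sum(R[r][c] * d[r] for r in range(3)) for c in range(3)))
    return world


# Made-up pose: 90 degree rotation about the z axis plus a shift along x.
R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
t = (1.0, 0.0, 0.0)
print(camera_to_world([(-1.0, 1.0, 3.0)], R, t))
```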

Visualization in the Egocentric Coordinate:

Run vis.py to view the reconstructed body mesh on the image plane.

python3 vis.py '/home/miao/data/rylm/segmented_data/miao_mainbuilding_0-1/'
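Viewing a mesh on the image plane rests on the pinhole projection, sketched below for a few camera-frame points. This is not the rendering code in vis.py: the function name is hypothetical, the focal length 694.0 is the value from the SMPLify-X command above, and the principal point (640, 360) is an assumption that should be replaced with the values for the actual camera. Lens distortion is ignored.

```python
def project_points(points_cam, focal_length, cx, cy):
    """Pinhole projection of camera-frame 3D points to pixel coordinates."""
    pixels = []
    for x, y, z in points_cam:
        if z <= 0:
            continue  # skip points on or behind the camera plane
        pixels.append((focal_length * x / z + cx, focal_length * y / z + cy))
    return pixels


# One point on the optical axis, one off-axis, one behind the camera.
points = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 0.0, -1.0)]
print(project_points(points, focal_length=694.0, cx=640.0, cy=360.0))
```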

Citation

If you find our code useful in your research, please use the following BibTeX entry for citation.

@inproceedings{liu20204d,
  title={4D Human Body Capture from Egocentric Video via 3D Scene Grounding},
  author={Liu, Miao and Yang, Dexin and Zhang, Yan and Cui, Zhaopeng and Rehg, James M and Tang, Siyu},
  booktitle={3DV},
  year={2021}
}

Owner

Miao Liu