[BMVC2021] "TransFusion: Cross-view Fusion with Transformer for 3D Human Pose Estimation"

Last update: Dec 23, 2022

Overview

TransFusion-Pose

TransFusion: Cross-view Fusion with Transformer for 3D Human Pose Estimation
Haoyu Ma, Liangjian Chen, Deying Kong, Zhe Wang, Xingwei Liu, Hao Tang, Xiangyi Yan, Yusheng Xie, Shih-Yao Lin and Xiaohui Xie
In BMVC 2021
[Paper] [Video]

Overview

We propose the TransFusion, which apply the transformer architecture to multi-view 3D human pose estimation
We propose the Epipolar Field, a novel and more general form of epipolar line. It readily integrates with the transformer through our proposed geometry positional encoding to encode the 3D relationships among different views.
Extensive experiments are conducted to demonstrate that our TransFusion outperforms previous fusion methods on both Human 3.6M and SkiPose datasets, but requires substantially fewer parameters.

Installation

Clone this repo, and we'll call the directory that you cloned multiview-pose as ${POSE_ROOT}

git clone https://github.com/HowieMa/TransFusion-Pose.git

Install dependencies.

pip install -r requirements.txt

Download TransPose models pretrained on COCO.

wget https://github.com/yangsenius/TransPose/releases/download/Hub/tp_r_256x192_enc3_d256_h1024_mh8.pth

You can also download it from the official website of TransPose

Please download them under ${POSE_ROOT}/models, and make them look like this:

${POSE_ROOT}/models
└── pytorch
    └── coco
        └── tp_r_256x192_enc3_d256_h1024_mh8.pth

Data preparation

Human 3.6M

For Human36M data, please follow H36M-Toolbox to prepare images and annotations.

Ski-Pose

For Ski-Pose, please follow the instruction from their website to obtain the dataset.
Once you download the Ski-PosePTZ-CameraDataset-png.zip and ski_centers.csv, unzip them and put into the same folder, named as ${SKI_ROOT}.
Run python data/preprocess_skipose.py ${SKI_ROOT} to format it.

Your folder should look like this:

${POSE_ROOT}
|-- data
|-- |-- h36m
    |-- |-- annot
        |   |-- h36m_train.pkl
        |   |-- h36m_validation.pkl
        |-- images
            |-- s_01_act_02_subact_01_ca_01 
            |-- s_01_act_02_subact_01_ca_02

|-- |-- preprocess_skipose.py
|-- |-- skipose  
    |-- |-- annot
        |   |-- ski_train.pkl
        |   |-- ski_validation.pkl
        |-- images
            |-- seq_103 
            |-- seq_103

Training and Testing

Human 3.6M

# Training
python run/pose2d/train.py --cfg experiments-local/h36m/transpose/256_fusion_enc3_GPE.yaml --gpus 0,1,2,3

# Evaluation (2D)
python run/pose2d/valid.py --cfg experiments-local/h36m/transpose/256_fusion_enc3_GPE.yaml --gpus 0,1,2,3  

# Evaluation (3D)
python run/pose3d/estimate_tri.py --cfg experiments-local/h36m/transpose/256_fusion_enc3_GPE.yaml

Ski-Pose

# Training
python run/pose2d/train.py --cfg experiments-local/skipose/transpose/256_fusion_enc3_GPE.yaml --gpus 0,1,2,3

# Evaluation (2D)
python run/pose2d/valid.py --cfg experiments-local/skipose/transpose/256_fusion_enc3_GPE.yaml --gpus 0,1,2,3

# Evaluation (3D)
python run/pose3d/estimate_tri.py --cfg experiments-local/skipose/transpose/256_fusion_enc3_GPE.yaml

Our trained models can be downloaded from here

Citation

If you find our code helps your research, please cite the paper:

@inproceedings{ma2021transfusion,
  title={TransFusion: Cross-view Fusion with Transformer for 3D Human Pose Estimation},
  author={Ma, Haoyu and Chen, Liangjian and Kong, Deying and Wang, Zhe and Liu, Xingwei and Tang, Hao and Yan, Xiangyi and Xie, Yusheng and Lin, Shih-Yao and Xie, Xiaohui},
  booktitle={British Machine Vision Conference},
  year={2021}
}

[BMVC2021] "TransFusion: Cross-view Fusion with Transformer for 3D Human Pose Estimation"

Related tags

Overview

TransFusion-Pose

Overview

Installation

Data preparation

Human 3.6M

Ski-Pose

Training and Testing

Human 3.6M

Ski-Pose

Citation

Acknowledgement

Owner

Haoyu Ma

The datasets and code of ACL 2021 paper "Aspect-Category-Opinion-Sentiment Quadruple Extraction with Implicit Aspects and Opinions".

Unsupervised 3D Human Mesh Recovery from Noisy Point Clouds

Implementation of Pooling by Sliced-Wasserstein Embedding (NeurIPS 2021)

Official repo of the paper "Surface Form Competition: Why the Highest Probability Answer Isn't Always Right"

Implementation of ReSeg using PyTorch

Franka Emika Panda manipulator kinematics&dynamics simulation

Equipped customers with insights about their EVs Hourly energy consumption and helped predict future charging behavior using LSTM model

My take on a practical implementation of Linformer for Pytorch.

[NeurIPS 2021] Low-Rank Subspaces in GANs

MGFN: Multi-Graph Fusion Networks for Urban Region Embedding was accepted by IJCAI-2022.

Reimplementation of the paper `Human Attention Maps for Text Classification: Do Humans and Neural Networks Focus on the Same Words? (ACL2020)`

ECLARE: Extreme Classification with Label Graph Correlations

QRec: A Python Framework for quick implementation of recommender systems (TensorFlow Based)

Styleformer - Official Pytorch Implementation

Official implementation of "StyleCariGAN: Caricature Generation via StyleGAN Feature Map Modulation" (SIGGRAPH 2021)

BLEURT is a metric for Natural Language Generation based on transfer learning.

Action Recognition for Self-Driving Cars

Pytorch implementation code for [Neural Architecture Search for Spiking Neural Networks]

ML-PersonalWork - Big assignment PersonalWork in Machine Learning, 2021 autumn BUAA.

RID-Noise: Towards Robust Inverse Design under Noisy Environments