Official implementation of MSR-GCN (ICCV 2021 paper)

Overview

MSR-GCN

Official implementation of MSR-GCN: Multi-Scale Residual Graph Convolution Networks for Human Motion Prediction (ICCV 2021 paper)

[Paper] [Supp] [Poster] [Slides]

Authors

  1. Lingwei Dang, School of Computer Science and Engineering, South China University of Technology, China, [email protected]
  2. Yongwei Nie, School of Computer Science and Engineering, South China University of Technology, China, [email protected]
  3. Chengjiang Long, JD Finance America Corporation, USA, [email protected]
  4. Qing Zhang, School of Computer Science and Engineering, Sun Yat-sen University, China, [email protected]
  5. Guiqing Li, School of Computer Science and Engineering, South China University of Technology, China, [email protected]

Overview

    Human motion prediction is a challenging task due to the stochasticity and aperiodicity of future poses. Recently, graph convolutional network (GCN) has been proven to be very effective to learn dynamic relations among pose joints, which is helpful for pose prediction. On the other hand, one can abstract a human pose recursively to obtain a set of poses at multiple scales. With the increase of the abstraction level, the motion of the pose becomes more stable, which benefits pose prediction too. In this paper, we propose a novel multi-scale residual Graph Convolution Network (MSR-GCN) for human pose prediction task in the manner of end-to-end. The GCNs are used to extract features from fine to coarse scale and then from coarse to fine scale. The extracted features at each scale are then combined and decoded to obtain the residuals between the input and target poses. Intermediate supervisions are imposed on all the predicted poses, which enforces the network to learn more representative features. Our proposed approach is evaluated on two standard benchmark datasets, i.e., the Human3.6M dataset and the CMU Mocap dataset. Experimental results demonstrate that our method outperforms the state-of-the-art approaches.

Dependencies

  • Pytorch 1.7.0+cu110
  • Python 3.8.5
  • Nvidia RTX 3090

Get the data

Human3.6m in exponential map can be downloaded from here.

CMU mocap was obtained from the repo of ConvSeq2Seq paper.

About datasets

Human3.6M

  • A pose in h3.6m has 32 joints, from which we choose 22, and build the multi-scale by 22 -> 12 -> 7 -> 4 dividing manner.
  • We use S5 / S11 as test / valid dataset, and the rest as train dataset, testing is done on the 15 actions separately, on each we use all data instead of the randomly selected 8 samples.
  • Some joints of the origin 32 have the same position
  • The input / output length is 10 / 25

CMU Mocap dataset

  • A pose in cmu has 38 joints, from which we choose 25, and build the multi-scale by 25 -> 12 -> 7 -> 4 dividing manner.
  • CMU does not have valid dataset, testing is done on the 8 actions separately, on each we use all data instead of the random selected 8 samples.
  • Some joints of the origin 38 have the same position
  • The input / output length is 10 / 25

Train

  • train on Human3.6M:

    python main.py --exp_name=h36m --is_train=1 --output_n=25 --dct_n=35 --test_manner=all

  • train on CMU Mocap:

    python main.py --exp_name=cmu --is_train=1 --output_n=25 --dct_n=35 --test_manner=all

Evaluate and visualize results

  • evaluate on Human3.6M:

    python main.py --exp_name=h36m --is_load=1 --model_path=ckpt/pretrained/h36m_in10out25dctn35_best_err57.9256.pth --output_n=25 --dct_n=35 --test_manner=all

  • evaluate on CMU Mocap:

    python main.py --exp_name=cmu --is_load=1 --model_path=ckpt/pretrained/cmu_in10out25dctn35_best_err37.2310.pth --output_n=25 --dct_n=35 --test_manner=all

Results

H3.6M-10/25/35-all 80 160 320 400 560 1000 -
walking 12.16 22.65 38.65 45.24 52.72 63.05 -
eating 8.39 17.05 33.03 40.44 52.54 77.11 -
smoking 8.02 16.27 31.32 38.15 49.45 71.64 -
discussion 11.98 26.76 57.08 69.74 88.59 117.59 -
directions 8.61 19.65 43.28 53.82 71.18 100.59 -
greeting 16.48 36.95 77.32 93.38 116.24 147.23 -
phoning 10.10 20.74 41.51 51.26 68.28 104.36 -
posing 12.79 29.38 66.95 85.01 116.26 174.33 -
purchases 14.75 32.39 66.13 79.63 101.63 139.15 -
sitting 10.53 21.99 46.26 57.80 78.19 120.02 -
sittingdown 16.10 31.63 62.45 76.84 102.83 155.45 -
takingphoto 9.89 21.01 44.56 56.30 77.94 121.87 -
waiting 10.68 23.06 48.25 59.23 76.33 106.25 -
walkingdog 20.65 42.88 80.35 93.31 111.87 148.21 -
walkingtogether 10.56 20.92 37.40 43.85 52.93 65.91 -
Average 12.11 25.56 51.64 62.93 81.13 114.18 57.93

CMU-10/25/35-all 80 160 320 400 560 1000 -
basketball 10.24 18.64 36.94 45.96 61.12 86.24 -
basketball_signal 3.04 5.62 12.49 16.60 25.43 49.99 -
directing_traffic 6.13 12.60 29.37 39.22 60.46 114.56 -
jumping 15.19 28.85 55.97 69.11 92.38 126.16 -
running 13.17 20.91 29.88 33.37 38.26 43.62 -
soccer 10.92 19.40 37.41 47.00 65.25 101.85 -
walking 6.38 10.25 16.88 20.05 25.48 36.78 -
washwindow 5.41 10.93 24.51 31.79 45.13 70.16 -
Average 8.81 15.90 30.43 37.89 51.69 78.67 37.23

Train

  • train on Human3.6M: python main.py --expname=h36m --is_train=1 --output_n=25 --dct_n=35 --test_manner=all
  • train on CMU Mocap: python main.py --expname=cmu --is_train=1 --output_n=25 --dct_n=35 --test_manner=all

Citation

If you use our code, please cite our work

@InProceedings{Dang_2021_ICCV,
    author    = {Dang, Lingwei and Nie, Yongwei and Long, Chengjiang and Zhang, Qing and Li, Guiqing},
    title     = {MSR-GCN: Multi-Scale Residual Graph Convolution Networks for Human Motion Prediction},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {11467-11476}
}

Acknowledgments

Some of our evaluation code and data process code was adapted/ported from LearnTrajDep by Wei Mao.

Licence

MIT

Owner
LevonDang
Pursuing the M.E. degree with the School of Computer Science and Engineering, South China University of Technology, 2020-.
LevonDang
We present a regularized self-labeling approach to improve the generalization and robustness properties of fine-tuning.

Overview This repository provides the implementation for the paper "Improved Regularization and Robustness for Fine-tuning in Neural Networks", which

NEU-StatsML-Research 21 Sep 08, 2022
Implementation of 'lightweight' GAN, proposed in ICLR 2021, in Pytorch. High resolution image generations that can be trained within a day or two

512x512 flowers after 12 hours of training, 1 gpu 256x256 flowers after 12 hours of training, 1 gpu Pizza 'Lightweight' GAN Implementation of 'lightwe

Phil Wang 1.5k Jan 02, 2023
Official implementation of MSR-GCN (ICCV 2021 paper)

MSR-GCN Official implementation of MSR-GCN: Multi-Scale Residual Graph Convolution Networks for Human Motion Prediction (ICCV 2021 paper) [Paper] [Sup

LevonDang 42 Nov 07, 2022
LightLog is an open source deep learning based lightweight log analysis tool for log anomaly detection.

LightLog Introduction LightLog is an open source deep learning based lightweight log analysis tool for log anomaly detection. Function description [BG

25 Dec 17, 2022
Change Detection in SAR Images Based on Multiscale Capsule Network

SAR_CD_MS_CapsNet Code for the paper "Change Detection in SAR Images Based on Multiscale Capsule Network" , IEEE Geoscience and Remote Sensing Letters

Feng Gao 21 Nov 29, 2022
BridgeGAN - Tensorflow implementation of Bridging the Gap between Label- and Reference-based Synthesis in Multi-attribute Image-to-Image Translation.

Bridging the Gap between Label- and Reference based Synthesis(ICCV 2021) Tensorflow implementation of Bridging the Gap between Label- and Reference-ba

huangqiusheng 8 Jul 13, 2022
Research on Event Accumulator Settings for Event-Based SLAM

Research on Event Accumulator Settings for Event-Based SLAM This is the source code for paper "Research on Event Accumulator Settings for Event-Based

Robin Shaun 26 Dec 21, 2022
Multi-Agent Reinforcement Learning (MARL) method to learn scalable control polices for multi-agent target tracking.

scalableMARL Scalable Reinforcement Learning Policies for Multi-Agent Control CD. Hsu, H. Jeong, GJ. Pappas, P. Chaudhari. "Scalable Reinforcement Lea

Christopher Hsu 17 Nov 17, 2022
[ICCV 2021 Oral] SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution with Skip-Transformer

This repository contains the source code for the paper SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution with Skip-Transformer (ICCV 2021 Oral). The project page is here.

AllenXiang 65 Dec 26, 2022
Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with predefined topics from an unlabeled document corpus.

Lbl2Vec Lbl2Vec is an algorithm for unsupervised document classification and unsupervised document retrieval. It automatically generates jointly embed

sebis - TUM - Germany 61 Dec 20, 2022
Multiple style transfer via variational autoencoder

ST-VAE Multiple style transfer via variational autoencoder By Zhi-Song Liu, Vicky Kalogeiton and Marie-Paule Cani This repo only provides simple testi

13 Oct 29, 2022
Extremely easy multi instancing software for minecraft speedrunning.

Easy Multi Extremely easy multi/single instancing software for minecraft speedrunning. A couple of goals of this project: Setup multi in minutes No fi

Duncan 8 Jul 16, 2022
Automatic detection and classification of Covid severity degree in LUS (lung ultrasound) scans

Final-Project Final project in the Technion, Biomedical faculty, by Mor Ventura, Dekel Brav & Omri Magen. Subproject 1: Automatic Detection of LUS Cha

Mor Ventura 1 Dec 18, 2021
Robust Consistent Video Depth Estimation

[CVPR 2021] Robust Consistent Video Depth Estimation This repository contains Python and C++ implementation of Robust Consistent Video Depth, as descr

Facebook Research 213 Dec 17, 2022
Raster Vision is an open source Python framework for building computer vision models on satellite, aerial, and other large imagery sets

Raster Vision is an open source Python framework for building computer vision models on satellite, aerial, and other large imagery sets (including obl

Azavea 1.7k Dec 22, 2022
113 Nov 28, 2022
Novel Instances Mining with Pseudo-Margin Evaluation for Few-Shot Object Detection

Novel Instances Mining with Pseudo-Margin Evaluation for Few-Shot Object Detection (NimPme) The official implementation of Novel Instances Mining with

12 Sep 08, 2022
Deep Learning: Architectures & Methods Project: Deep Learning for Audio Super-Resolution

Deep Learning: Architectures & Methods Project: Deep Learning for Audio Super-Resolution Figure: Example visualization of the method and baseline as a

Oliver Hahn 16 Dec 23, 2022
Code for "Discovering Non-monotonic Autoregressive Orderings with Variational Inference" (paper and code updated from ICLR 2021)

Discovering Non-monotonic Autoregressive Orderings with Variational Inference Description This package contains the source code implementation of the

Xuanlin (Simon) Li 10 Dec 29, 2022
This repository is dedicated to developing and maintaining code for experiments with wide neural networks.

Wide-Networks This repository contains the code of various experiments on wide neural networks. In particular, we implement classes for abc-parameteri

Karl Hajjar 0 Nov 02, 2021