3D HourGlass Networks for Human Pose Estimation Through Videos

Last update: Jan 02, 2023

Overview

3D-HourGlass-Network

A 3D-CNN-based hourglass network for 3D human pose estimation from videos. This was my summer 2018 research project.

Discussion

In this work I extend the idea of Carreira et al. (CVPR'17), who inflate 2D CNNs into 3D CNNs for action recognition from videos, to human pose estimation from videos. We start from a pretrained hourglass network with a fully connected depth regressor, inflate its 2D convolutions to 3D convolutions, and perform temporal 3D human pose estimation. The inflation lets the network learn features from nearby frames and refine its predictions. A similar idea was used in Girdhar et al. (CVPR'18) (at about the same time!), where they perform multi-person human pose estimation from videos using an inflated Mask R-CNN.
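The inflation step can be illustrated with a small framework-free sketch. It is not the code in this repository: the function names are hypothetical, and it assumes the usual inflation scheme in which the 2D kernel is repeated along a new temporal axis and divided by the temporal extent. In PyTorch the same idea would amount to repeating a 2D convolution weight along an added time dimension and rescaling it.

```python
def inflate_kernel(kernel_2d, num_frames):
    """Inflate a 2D kernel (list of rows) into a 3D kernel.

    The result has num_frames temporal slices, each equal to the 2D kernel
    divided by num_frames. A clip made of one repeated frame then gives the
    same response as the original 2D kernel gives on that frame.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    return [[[w / num_frames for w in row] for row in kernel_2d]
            for _ in range(num_frames)]


def correlate_2d(image, kernel):
    """Valid-mode 2D cross-correlation (what CNN 'convolution' computes)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def correlate_3d(clip, kernel):
    """Valid-mode 3D cross-correlation over (time, height, width)."""
    kt = len(kernel)
    out = []
    for t in range(len(clip) - kt + 1):
        # One 2D response per temporal slice of the kernel, summed over time.
        slices = [correlate_2d(clip[t + d], kernel[d]) for d in range(kt)]
        rows, cols = len(slices[0]), len(slices[0][0])
        out.append([[sum(s[i][j] for s in slices) for j in range(cols)]
                    for i in range(rows)])
    return out
```

Because the inflated kernel reproduces the 2D response on a static clip, the pretrained 2D weights remain a sensible starting point, and training on real video can then adjust the temporal slices independently.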

Requirements

Python 3.6
PyTorch 0.4
torchvision
progress

Datasets

We used the Human3.6M dataset for this project.

Instructions to run

python main.py -expID [EXP-NAME] -nFramesReg [NUM-FRAMES]

Results

We improved on the baseline hourglass network, lowering the MPJPE (mean per-joint position error) from 64 to 62.8, which shows the value of temporal features in real-world problems. The idea could easily be extended to other tasks such as semantic segmentation and object detection.
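For reference, MPJPE (mean per-joint position error) is the average Euclidean distance between predicted and ground-truth 3D joint positions, so lower is better. The sketch below is a minimal illustration with a hypothetical function name, not the evaluation code of this project. It assumes both poses are already expressed in the same coordinate frame and units, and it omits any alignment step that a particular evaluation protocol may apply.

```python
import math


def mpjpe(predicted, target):
    """Mean per-joint position error for one pose.

    predicted, target: sequences of (x, y, z) joint positions, in the same
    order and the same units. Returns the mean Euclidean distance.
    """
    if len(predicted) != len(target) or len(predicted) == 0:
        raise ValueError("poses must be non-empty and have the same number of joints")
    total = 0.0
    for p, t in zip(predicted, target):
        # Euclidean distance between the predicted and true joint.
        total += math.sqrt(sum((pc - tc) ** 2 for pc, tc in zip(p, t)))
    return total / len(predicted)
```

For a dataset-level score, this per-pose value would typically be averaged over all evaluated frames.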

Owner

Naman Jain
