A Comparative Review of Recent Kinect-Based Action Recognition Algorithms (TIP2020, Matlab codes)


This repo contains:

  • the HDG implementation (Matlab code) for 'Analysis and Evaluation of Kinect-based Action Recognition Algorithms',
  • links (Google Drive) for downloading the algorithms evaluated in our TIP journal paper, and
  • direct links (Google Drive) for downloading 5 smaller datasets for action recognition research.

1 Introduction

This repository contains the implementation of HDG presented in the following papers:

[1] Lei Wang, 2017. Analysis and Evaluation of Kinect-based Action Recognition Algorithms. Master's thesis. School of Computer Science and Software Engineering, The University of Western Australia. [ArXiv] [BibTex]

[2] Lei Wang, Du Q. Huynh, and Piotr Koniusz. A Comparative Review of Recent Kinect-Based Action Recognition Algorithms. IEEE Transactions on Image Processing, 29: 15-28, 2020. [ArXiv] [BibTex]

We also provide the links for downloading the algorithms/datasets used in our TIP paper.

2 Other algorithms compared in TIP paper

You can download the other algorithms we evaluated in the TIP paper from the following links:

3 Datasets used in TIP paper

3.1 Five Smaller datasets

3.1.1 Depth+Skeleton

You can directly download the depth+skeleton sequences for the following smaller datasets here:

The 5 datasets above contain depth + skeleton data, which you can use directly with the HDG algorithm in this repo:

  • unzip a dataset,
  • put the Dataset folder into the HDG folder, then
  • extract the features (refer to the following sections for more details).

3.1.2 Depth video only

To download the depth-only data for UWA3D Activity and UWA3D Multiview Activity II, use this link (extraction code: 172h).

To download the depth-only data for CAD-60, use this link (extraction code: 36wt).

3.2 Big datasets (NTU RGB+D)

For big datasets such as NTU-60 and NTU-120, please refer to this link to request access to the download.

4 Run the codes of HDG

This is an implementation based on Rahmani et al.’s paper ‘Real Time Action Recognition Using Histograms of Depth Gradients and Random Decision Forests’ (WACV2014).

To run our new HDG algorithm (which is analysed and compared in our TIP2020 paper):

4.0 A glance at the skeleton configuration

For more details about the skeleton configuration/graph, please refer to the PDF file attached in this repo.

UWAS denotes the skeleton configuration for UWA3D Activity, and UWAW is for UWA3D Multiview Activity II.

4.1 Data preparation

  • Go to the 'Dataset' folder, then to the 'depth' folder, and copy all depth sequences into this folder. Each depth sequence should be a .mat file whose internal variable is named 'inDepthVideo'.

  • After that, go to the 'skeleton' folder and copy all skeleton sequences into it. Each skeleton sequence should also be a .mat file whose internal variable is named 'skeletonsequence', with dimensions #joints x 3 x #frames, where 3 represents x, y and d respectively.
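The expected file layout can be sketched as follows. This is only an illustration of the format described above: the file name 'a01_s01_e01.mat', the 240 x 320 frame size, the rows x cols x frames ordering of the depth video, and the temporary output folder are all assumptions, not conventions confirmed by this repo.

```matlab
% Minimal sketch: write one depth and one skeleton sequence in the
% format described above, then reload them as a sanity check.
% File names, frame size and output location are hypothetical.
numFrames = 5;
numJoints = 20;                          % e.g. 20 joints for MSRAction3D

% Depth video (assumed layout: rows x cols x frames).
inDepthVideo = zeros(240, 320, numFrames);

% Skeleton sequence: #joints x 3 x #frames, where 3 holds x, y and d.
skeletonsequence = zeros(numJoints, 3, numFrames);

% Stand-in for the 'Dataset' folder; replace with your own path.
outDir = tempname;
mkdir(outDir);
mkdir(outDir, 'depth');
mkdir(outDir, 'skeleton');

depthFile = fullfile(outDir, 'depth', 'a01_s01_e01.mat');
skelFile  = fullfile(outDir, 'skeleton', 'a01_s01_e01.mat');

% The variable names inside the .mat files must match the ones above.
save(depthFile, 'inDepthVideo');
save(skelFile, 'skeletonsequence');

% Reload and check the variable names and the skeleton dimensions.
d = load(depthFile);
s = load(skelFile);
assert(isfield(d, 'inDepthVideo'), 'Depth file must contain inDepthVideo');
assert(isfield(s, 'skeletonsequence'), 'Skeleton file must contain skeletonsequence');
assert(ndims(s.skeletonsequence) == 3 && size(s.skeletonsequence, 2) == 3, ...
    'Skeleton must be #joints x 3 x #frames');
```

Running a check like this on a few of your own files before feature extraction can catch a misnamed variable early.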

4.2 Feature extraction and concatenation

  • Go to the 'MATLAB_Codes' folder and run the 'main' in each algorithm folder (in the order 00, 01, 02 and 03), then run the 'main' in 'feature_concatenating'. You can also run '02' and '03' first and then '00' and '01', since '00' may need more time for segmenting the foreground (around 6 hours) and '01' depends on the results of '00'.

  • For the UWAMultiview dataset, remember to convert the video sequence from uint16 to double using im2double before running the main in 00 and 01: in the main function of both the 00 and 01 folders (lines 33 and 17), change depthsequence=actionvolume; to depthsequence=im2double(actionvolume);.

  • For feature concatenation, you can select different combinations of features for classification. There are four features:

    • hod (histogram of depth),
    • hodg (histogram of depth gradients),
    • jmv (joint movement volume features), and
    • jpd (joint position differences features).

  • Remember to change the number of joints and the torso joint ID in the 'main' of '02' and '03', since different datasets have different numbers of joints and torso joint IDs (refer to the PDF attached in this repo for the skeleton configuration):

    • MSRPairs (3D Action Pairs): 20 joints, torso joint ID is '2';
    • MSRAction3D: 20 joints, torso joint ID is '4';
    • CAD-60: 15 joints, torso joint ID is '3';
    • UWA3D single view dataset (UWA3D Activity): 15 joints, torso joint ID is '9';
    • UWA3D multi view dataset (UWA3D Multiview Activity II): 15 joints, torso joint ID is '3'.
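The two configurable choices above (the per-dataset skeleton settings and the feature combination) can be sketched as follows. The joint counts and torso joint IDs are copied from the list above; everything else is hypothetical, including the struct layout, the short dataset keys, the random feature matrices and their sizes, and the assumption that each feature is stored as one row per sample. The actual 'main' scripts in this repo may organise these differently.

```matlab
% Per-dataset skeleton settings, copied from the list above.
% The struct layout and the short dataset keys are hypothetical.
cfg = struct( ...
    'name',      {'MSRPairs', 'MSRAction3D', 'CAD60', 'UWA3DActivity', 'UWA3DMultiview'}, ...
    'numJoints', {20, 20, 15, 15, 15}, ...
    'torsoID',   {2, 4, 3, 9, 3});

datasetName = 'MSRAction3D';
idx = find(strcmp({cfg.name}, datasetName));
if isempty(idx)
    error('Unknown dataset: %s', datasetName);
end
numJoints = cfg(idx).numJoints;   % value to set in the 'main' of '02' and '03'
torsoID   = cfg(idx).torsoID;

% Hypothetical feature matrices, one row per sample (sizes are made up).
numSamples = 4;
features.hod  = rand(numSamples, 10);   % histogram of depth
features.hodg = rand(numSamples, 12);   % histogram of depth gradients
features.jmv  = rand(numSamples, 6);    % joint movement volume features
features.jpd  = rand(numSamples, 8);    % joint position differences features

% Choose any combination of the four features and concatenate column-wise.
selected = {'hodg', 'jpd'};
X = [];
for k = 1:numel(selected)
    X = [X, features.(selected{k})]; %#ok<AGROW>
end

fprintf('%s: %d joints, torso joint %d, feature length %d\n', ...
    datasetName, numJoints, torsoID, size(X, 2));
```

Keeping the dataset settings in one table makes it harder to forget the torso joint ID when switching datasets.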

4.3 Classification

  • Run the 'main' of the random decision forests for your dataset (a different 'main' is provided for each dataset, since each dataset has its own training and testing split). In this implementation, half of the data is used for training and the remaining half for testing.

    • MSRPairs (3D Action Pairs): msrpairsmain.m
    • MSRAction3D: msr3dmain.m
    • CAD-60: cadmain.m
    • UWA3D single view (UWA3D Activity): uwasinglemain.m
    • UWA3D multi view (UWA3D Multiview Activity II): uwamultimain.m
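A half/half split by subject can be sketched as follows. This is an assumption-laden illustration rather than the split used in the scripts above: the subject IDs are made up, and choosing the first half of the sorted subject IDs for training is an arbitrary choice.

```matlab
% Minimal sketch of a half/half split by subject.
% subjectIDs is hypothetical: one subject ID per sample.
subjectIDs = [1 1 2 2 3 3 4 4 5 5 6 6]';

subjects = unique(subjectIDs);
numTrainSubjects = floor(numel(subjects) / 2);

% Assumption: the first half of the sorted subject IDs is used for training.
trainSubjects = subjects(1:numTrainSubjects);

isTrain  = ismember(subjectIDs, trainSubjects);
trainIdx = find(isTrain);     % sample indices used for training
testIdx  = find(~isTrain);    % sample indices used for testing

fprintf('%d training samples, %d testing samples\n', ...
    numel(trainIdx), numel(testIdx));
```

Splitting by subject rather than by sample keeps all sequences of one person on the same side of the split.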

4.4 Visualization (i.e., confusion matrix)

  • The confusion matrix will be saved in the 'Results' folder and displayed. The total accuracy will appear in the MATLAB workspace.
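For reference, a confusion matrix and the total accuracy can be computed from true and predicted labels as sketched below. The label vectors are made-up toy data, and this is a generic computation using base MATLAB functions, not necessarily how the scripts in this repo produce their results.

```matlab
% Toy labels (hypothetical): class indices in 1..numClasses.
trueLabels = [1 1 2 2 3 3]';
predLabels = [1 2 2 2 3 1]';
numClasses = 3;

% Rows index the true class, columns index the predicted class.
confMat = accumarray([trueLabels, predLabels], 1, [numClasses, numClasses]);

% Total accuracy: correctly classified samples over all samples.
accuracy = trace(confMat) / sum(confMat(:));

% Row-normalised version (per-class rates); guard against empty classes.
rowSums = sum(confMat, 2);
confMatNorm = bsxfun(@rdivide, confMat, max(rowSums, 1));

fprintf('Total accuracy: %.2f%%\n', 100 * accuracy);
```

To display the matrix, imagesc(confMatNorm); colorbar; gives a quick heat map.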

4.4.1 Save figures to pdf format

  • The saveTightFigure function was obtained from an online resource and can be used to save the confusion matrix plot as a PDF file. For example: saveTightFigure(gcf, 'uwamultiview.pdf');

Code for parameter evaluation and for running over all possible combinations of selecting half of the subjects (for training) is not provided in this repo.

For more information, please refer to my research report and our journal paper, or contact me.

5 Citations

You can cite the following papers for the use of this work:

@mastersthesis{lei_thesis_2017,
  author       = {Lei Wang}, 
  title        = {Analysis and Evaluation of {K}inect-based Action Recognition Algorithms},
  school       = {School of Computer Science and Software Engineering, The University of Western Australia},
  year         = 2017,
  month        = {Nov}
}
@article{lei_tip_2019,
author={Lei Wang and Du Q. Huynh and Piotr Koniusz},
journal={IEEE Transactions on Image Processing},
title={A Comparative Review of Recent Kinect-Based Action Recognition Algorithms},
year={2020},
volume={29},
number={},
pages={15-28},
doi={10.1109/TIP.2019.2925285},
ISSN={1941-0042},
month={},}

Acknowledgments

I am grateful to Associate Professor Du Huynh for her valuable suggestions and discussions. We would like to thank the authors of HON4D, HOPC, LARP-SO, HPM+TM, IndRNN and ST-GCN for making their code publicly available. We thank the ROSE Lab of Nanyang Technological University (NTU), Singapore, for making the NTU RGB+D dataset freely accessible.
