Auto-Encoding Score Distribution Regression for Action Quality Assessment

Last update: Nov 16, 2022

DAE-AQA

This is the open-source reference implementation of the paper Auto-Encoding Score Distribution Regression for Action Quality Assessment.

1.Introduction

DAE is a model for action quality assessment (AQA). It combines the advantages of regression algorithms and label distribution learning (LDL). Specifically, it encodes videos into distributions and uses the reparameterization trick from variational auto-encoders (VAE) to sample scores, which establishes a more accurate mapping between video and score. It can be applied to many scenarios, e.g., judging the accuracy of an operation or estimating the score of a diving athlete's performance.
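The sampling step described above can be illustrated with a minimal, framework-free sketch. It assumes the encoder outputs a mean and a log-variance for each video, which is the usual VAE parameterization and is not confirmed by this README. The function name sample_score and the numeric values are hypothetical.

```python
import math
import random

def sample_score(mu, log_var, rng):
    """Reparameterization trick: score = mu + sigma * eps, with eps ~ N(0, 1).

    The randomness lives in eps only, so in a real model gradients can
    flow through mu and sigma. Here we only illustrate the sampling itself.
    """
    sigma = math.exp(0.5 * log_var)  # log-variance -> standard deviation
    eps = rng.gauss(0.0, 1.0)        # noise independent of the encoder outputs
    return mu + sigma * eps

rng = random.Random(0)
# Hypothetical encoder outputs for one video: mean 85, variance 4.
mu, log_var = 85.0, math.log(4.0)
samples = [sample_score(mu, log_var, rng) for _ in range(10000)]
mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
print(round(mean, 2), round(std, 2))
```

The empirical mean and standard deviation of the samples should land close to the assumed 85 and 2, which is a quick sanity check that the sampled scores follow the encoded distribution.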

2.Datasets

MTL-AQA dataset

The MTL-AQA dataset was originally presented in the paper What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment (CVPR 2019) [arXiv], where the authors provide the YouTube links of the untrimmed long videos and the corresponding annotations here. The processed MTL-AQA dataset (frames) can be downloaded through the following links:

1.[Google Drive]

2.[Baidu Drive](Password:SEU1)

The whole data structure should be:

DAE_AQA
├── data
│   ├── frames
│   └── info
...
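As a quick check before training, the layout above can be verified with a small script. This is a sketch only: the helper name missing_data_dirs is hypothetical, and it checks just the two sub-directories shown in the tree, not their contents.

```python
import tempfile
from pathlib import Path

# Sub-directories taken from the layout above, relative to the DAE_AQA root.
EXPECTED_DIRS = ("data/frames", "data/info")

def missing_data_dirs(root):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]

# Demo on a throwaway directory that only contains data/frames.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "data" / "frames").mkdir(parents=True)
    demo_missing = missing_data_dirs(tmp)
print(demo_missing)
```

In practice you would call missing_data_dirs with the path of your DAE_AQA checkout and expect an empty list.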

JIGSAWS dataset

The JIGSAWS dataset was presented in the paper Jhu-isi gesture and skill assessment working set (jigsaws): A surgical activity dataset for human motion modeling (MICCAI workshop 2014), and the raw videos can be downloaded here. We are still cleaning up this part of the code and will release it soon. The data structure is the same as for MTL-AQA. The processed JIGSAWS dataset (frames) can be downloaded through the following links:

1.[Google Drive]

2.[Baidu Drive](Password:SEU1)

3.Training

Training the DAE model:

$ python DAE.py --log_info=DAE --num_workers=16 --gpu=0 --train_batch_size=8 --test_batch_size=32 --num_epochs=100

Training the DAE-MT model:

$ python DAE_MT.py --log_info=DAE-MT --num_workers=16 --gpu=0 --train_batch_size=8 --test_batch_size=32 --num_epochs=100

All default parameters are set in config.py. Because video processing consumes a large amount of GPU memory, we suggest training with a small batch size.
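For readers unfamiliar with this style of command line, the sketch below shows how flags like the ones in the commands above could be declared with argparse. It is not the repository's config.py: the parser layout is an assumption, and the defaults are simply copied from the example commands rather than from the actual file.

```python
import argparse

def build_parser():
    """Hypothetical parser mirroring the flags used in the training commands."""
    parser = argparse.ArgumentParser(description="Illustrative DAE training options")
    parser.add_argument("--log_info", type=str, default="DAE")
    parser.add_argument("--num_workers", type=int, default=16)
    parser.add_argument("--gpu", type=str, default="0")
    # A small training batch keeps GPU memory use manageable for video input.
    parser.add_argument("--train_batch_size", type=int, default=8)
    parser.add_argument("--test_batch_size", type=int, default=32)
    parser.add_argument("--num_epochs", type=int, default=100)
    return parser

# Override only the training batch size; every other option keeps its default.
args = build_parser().parse_args(["--train_batch_size=4"])
print(args.train_batch_size, args.test_batch_size, args.num_epochs)
```

If training runs out of GPU memory, lowering --train_batch_size is the first option to try, in line with the suggestion above.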

4.Testing

We provide pre-trained DAE-MT model weights that reach a correlation coefficient of 0.9449 on the MTL-AQA test set. You can download them through the following links:
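The README does not say which correlation coefficient is reported. Assuming it is Spearman's rank correlation, the metric commonly used in AQA work, the following self-contained sketch shows how predicted and ground-truth scores would be compared. The score values are made up for illustration and are not results from this model.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)

# Made-up scores for five clips, for illustration only.
predicted = [80.5, 62.0, 91.2, 45.3, 70.1]
ground_truth = [78.0, 65.5, 88.0, 50.0, 60.0]
rho = spearman(predicted, ground_truth)
print(round(rho, 4))
```

Because the metric depends only on ranks, a model can score well even when its predicted values are offset from the judges' scores, as long as the ordering of the clips is preserved.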

1.[Google Drive]

2.[Baidu Drive](Password:SEU1)

CONTACT US:

If you have any questions or encounter any bugs, please contact us!

E-mail: [email protected]
