[ICCV 2021] Group-aware Contrastive Regression for Action Quality Assessment

CoRe

Created by Xumin Yu*, Yongming Rao*, Wenliang Zhao, Jiwen Lu, Jie Zhou

This is the PyTorch implementation of the ICCV 2021 paper Group-aware Contrastive Regression for Action Quality Assessment (arXiv).

We present a new Contrastive Regression (CoRe) framework to learn the relative scores by pair-wise comparison, which highlights the differences between videos and guides the models to learn the key hints for action quality assessment.
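The pairwise idea can be illustrated with a minimal sketch. It is not the repository's code: it assumes that a model predicts a relative score for an (input, exemplar) pair, and that the final estimate is the exemplar's known score plus that predicted difference, averaged over several exemplars. The `predict_relative` callable, the scalar "features", and the toy values are all hypothetical.

```python
from statistics import mean
from typing import Callable, Sequence, Tuple


def contrastive_predict(
    query: float,
    exemplars: Sequence[Tuple[float, float]],
    predict_relative: Callable[[float, float], float],
) -> float:
    """Estimate the query's score from (exemplar_feature, exemplar_score) pairs.

    Each exemplar votes with its known score plus the predicted difference
    between the query and that exemplar; the votes are then averaged.
    """
    votes = [
        score + predict_relative(query, feature)
        for feature, score in exemplars
    ]
    return mean(votes)


# Hypothetical stand-in for a learned regressor: scaled feature difference.
def toy_relative(query_feature: float, exemplar_feature: float) -> float:
    return 10.0 * (query_feature - exemplar_feature)


exemplars = [(0.5, 50.0), (0.75, 75.0)]
predicted = contrastive_predict(0.625, exemplars, toy_relative)
print(predicted)  # 62.5
```

Predicting a difference against a reference with a known score, rather than an absolute score, is what lets the comparison focus on how two videos differ.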

Pretrained Model

Usage

Requirements

  • Python >= 3.6
  • PyTorch >= 1.4.0
  • torchvision >= 0.4.1
  • torch_videovision
pip install git+https://github.com/hassony2/torch_videovision

Download initial I3D

We use the Kinetics-pretrained I3D model from the repository kinetics_i3d_pytorch.

Dataset Preparation

MTL-AQA

  • Please download the dataset from the repository MTL-AQA. The data structure should be:
$DATASET_ROOT
├── MTL-AQA/
    ├── new
        ├── new_total_frames_256s
            ├── 01
            ...
            └── 09
    ├── info
        ├── final_annotations_dict_with_dive_number
        ├── test_split_0.pkl
        └── train_split_0.pkl
    └── model_rgb.pth

The processed annotations are already provided in this repo. You can download the prepared dataset from [BaiduYun](code:smff). Download and unzip the four zip files under MTL-AQA/, then arrange them according to the structure above. If you want to prepare the data yourself, see MTL_helper for guidance. We provide code for converting an online video into frame data.
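Before training, it can help to confirm that the files landed where the directory tree above expects them. The following is a small sketch, not part of the repository; the helper name `missing_mtl_entries` is hypothetical, and the relative paths are copied from the tree above.

```python
from pathlib import Path
from typing import List

# Relative paths taken from the MTL-AQA directory tree above.
MTL_EXPECTED = [
    "MTL-AQA/new/new_total_frames_256s",
    "MTL-AQA/info/final_annotations_dict_with_dive_number",
    "MTL-AQA/info/test_split_0.pkl",
    "MTL-AQA/info/train_split_0.pkl",
    "MTL-AQA/model_rgb.pth",
]


def missing_mtl_entries(dataset_root: str) -> List[str]:
    """Return the expected entries that do not exist under dataset_root."""
    root = Path(dataset_root)
    return [rel for rel in MTL_EXPECTED if not (root / rel).exists()]
```

Calling `missing_mtl_entries("/path/to/DATASET_ROOT")` returns an empty list when every expected entry is present.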

AQA-7

  • Download AQA-7 Dataset:
mkdir AQA-Seven && cd AQA-Seven
wget http://rtis.oit.unlv.edu/datasets/AQA-7.zip
unzip AQA-7.zip

The data structure should be:

$DATASET_ROOT
├── Seven/
    ├── diving-out
        ├── 001
            ├── img_00001.jpg
            ...
        ...
        └── 370
    ├── gym_vault-out
        ├── 001
            ├── img_00001.jpg
            ...
    ...

    └── Split_4
        ├── split_4_test_list.mat
        └── split_4_train_list.mat

You can download the prepared dataset from [BaiduYun](code:65rl). Unzip the file under Seven/.
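As a quick sanity check after unzipping, the frame folders can be counted per video. This is a sketch under the assumption that frames follow the `img_00001.jpg` naming shown in the tree above; the helper `count_frames` is hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import Dict


def count_frames(seven_root: str, action: str = "diving-out") -> Dict[str, int]:
    """Map each video folder (e.g. '001') to its number of img_*.jpg frames."""
    action_dir = Path(seven_root) / action
    return {
        video_dir.name: len(list(video_dir.glob("img_*.jpg")))
        for video_dir in sorted(action_dir.iterdir())
        if video_dir.is_dir()
    }
```

For example, `count_frames("/path/to/DATASET_ROOT/Seven", "gym_vault-out")` makes empty or partially extracted video folders easy to spot.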

JIGSAWS

  • Please download the dataset from JIGSAWS. You are required to complete a form before using this dataset for academic research.

The training and test code for JIGSAWS is on the way.

Training and Evaluation

To train a CoRe model:

bash ./scripts/train.sh <GPUIDS>  <MTL/Seven> <exp_name>  [--resume] 

For example,

# train a model on MTL
bash ./scripts/train.sh 0,1 MTL try 

# train a model on Seven
bash ./scripts/train.sh 0,1 Seven try --Seven_cls 1

To evaluate a pretrained model:

bash ./scripts/test.sh <GPUIDS>  <MTL/Seven> <exp_name>  --ckpts <path> [--Seven_cls <int>]

For example,

# test a model on MTL
bash ./scripts/test.sh 0 MTL try --ckpts ./MTL_CoRe.pth

# test a model on Seven
bash ./scripts/test.sh 0 Seven try --Seven_cls 1 --ckpts ./Seven_CoRe_1.pth
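When evaluating several checkpoints, it may be convenient to assemble the command line programmatically. The sketch below only rebuilds the argument order shown in the examples above; `build_test_command` is a hypothetical helper, not something the repository provides.

```python
from typing import List, Optional


def build_test_command(
    gpu_ids: str,
    benchmark: str,
    exp_name: str,
    ckpt_path: str,
    seven_cls: Optional[int] = None,
) -> List[str]:
    """Build the argv list for ./scripts/test.sh as used in the examples above."""
    if benchmark not in ("MTL", "Seven"):
        raise ValueError("benchmark must be 'MTL' or 'Seven'")
    cmd = ["bash", "./scripts/test.sh", gpu_ids, benchmark, exp_name]
    if seven_cls is not None:
        # --Seven_cls appears only in the Seven examples.
        cmd += ["--Seven_cls", str(seven_cls)]
    cmd += ["--ckpts", ckpt_path]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.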

Visualization Results

[Figure: visualization results]

Citation

If you find our work useful in your research, please consider citing:

@misc{yu2021groupaware,
      title={Group-aware Contrastive Regression for Action Quality Assessment}, 
      author={Xumin Yu and Yongming Rao and Wenliang Zhao and Jiwen Lu and Jie Zhou},
      year={2021},
      eprint={2108.07797},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}