Implementation of our paper "Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning".

Last update: Dec 29, 2022

PRP

Introduction

This is the implementation of our paper "Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning".

Getting started

Install

Our experiments run on Python 3.6.1 and PyTorch 0.4.1. All dependencies can be installed using pip:
```
python -m pip install -r requirements.txt
```

Data preparation

We conduct experiments on UCF101 and HMDB51 (split 1 of UCF101 for pre-training and the rest for fine-tuning). The expected dataset directory hierarchy is as follows:

├── UCF101/HMDB51
│   ├── split
│   │   ├── classInd.txt
│   │   ├── testlist01.txt
│   │   ├── trainlist01.txt
│   │   └── ...
│   └── video
│       ├── ApplyEyeMakeup
│       │   └── *.avi
│       └── ...
└── ...
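
Before launching a run, it can help to confirm that a dataset folder matches the hierarchy above. The following is a minimal sketch, not part of the repository: the function name `check_dataset_layout` is hypothetical, and it only checks the three split files and the per-class `.avi` folders shown in the tree.

```python
from pathlib import Path


def check_dataset_layout(root):
    """Return a list of problems found under `root` (empty if the layout looks right).

    Hypothetical helper: it mirrors the directory tree from this README only.
    """
    root = Path(root)
    problems = []

    # The split/ folder should hold the class index and the train/test lists.
    for name in ("classInd.txt", "trainlist01.txt", "testlist01.txt"):
        if not (root / "split" / name).is_file():
            problems.append(f"missing split/{name}")

    # The video/ folder should hold one sub-folder per action class.
    video_dir = root / "video"
    if not video_dir.is_dir():
        problems.append("missing video/ directory")
        return problems

    # Every class folder is expected to contain at least one .avi file.
    for class_dir in sorted(p for p in video_dir.iterdir() if p.is_dir()):
        if not any(class_dir.glob("*.avi")):
            problems.append(f"no .avi files in video/{class_dir.name}")
    return problems
```

Calling `check_dataset_layout("UCF101")` returns an empty list when the folder matches the expected layout, and a list of human-readable messages otherwise.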

Train and Test

Pre-training on Pretext Task

python train_predict.py --gpu 0 --epoch 300 --model_name c3d/r21d/r3d
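
As the paper title suggests, the pretext task is built on clips sampled at different playback rates. The sketch below only illustrates that idea and is not the code in `train_predict.py`: the function name `sample_clip_indices`, the clip length of 16 frames, and the rate set `(1, 2, 4, 8)` are all assumptions chosen for the example.

```python
import random


def sample_clip_indices(num_frames, clip_len=16, rates=(1, 2, 4, 8), rng=random):
    """Pick a playback rate and return (rate_label, frame_indices).

    Illustrative only: `clip_len` and `rates` are assumed values.
    Frames are taken every `rate` frames, so a higher rate plays faster.
    """
    # Keep only the rates for which a full clip fits inside the video.
    usable = [r for r in rates if (clip_len - 1) * r + 1 <= num_frames]
    if not usable:
        raise ValueError("video too short for any playback rate")

    rate = rng.choice(usable)
    span = (clip_len - 1) * rate + 1          # frames covered by the clip
    start = rng.randint(0, num_frames - span)  # inclusive on both ends
    indices = [start + i * rate for i in range(clip_len)]

    # The label is the position of the chosen rate within `rates`.
    return rates.index(rate), indices
```

A self-supervised classifier could then be trained to predict `rate_label` from the frames at `frame_indices`, without needing action labels.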

Action Recognition

python ft_classify.py --gpu 0 --model_name c3d/r21d/r3d --pre_path [your pre-trained model] --split 1/2/3
python test_classify.py
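
In the commands above, `c3d/r21d/r3d` and `1/2/3` list alternatives: pass exactly one value for each flag. A minimal sketch for generating one fine-tuning command per split is shown below. The helper `finetune_commands` is hypothetical; it uses only the flags that appear in the command above, and the script name is passed in as `script` rather than assumed.

```python
import shlex

MODEL_NAMES = ("c3d", "r21d", "r3d")  # the choices listed for --model_name


def finetune_commands(script, model_name, pre_path, splits=(1, 2, 3), gpu=0):
    """Build one fine-tuning command (as an argument list) per split.

    Hypothetical helper: it only assembles arguments and runs nothing.
    """
    if model_name not in MODEL_NAMES:
        raise ValueError(f"model_name must be one of {MODEL_NAMES}")
    return [
        [
            "python", script,
            "--gpu", str(gpu),
            "--model_name", model_name,
            "--pre_path", pre_path,
            "--split", str(split),
        ]
        for split in splits
    ]


def as_shell(cmd):
    """Render an argument list as a single shell-safe command line."""
    return " ".join(shlex.quote(arg) for arg in cmd)
```

Each returned list can be printed with `as_shell(cmd)` for inspection or passed to `subprocess.run(cmd, check=True)` to launch the run.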

Video Retrieval

Please refer to the code video_retrieval_samples.py of VCOP.

Model zoo

Models

Pre-trained PRP models on split 1 of UCF101: C3D (OneDrive); R3D (OneDrive); R(2+1)D (OneDrive)

Action Recognition Results

| Architecture | UCF101 (%) | HMDB51 (%) |
| --- | --- | --- |
| C3D | 69.1 | 34.5 |
| R3D | 66.5 | 29.7 |
| R(2+1)D | 72.1 | 35.0 |

License

This project is released under the Apache 2.0 license.

Citation

Please cite the following paper if you find PRP useful in your research:

@InProceedings{Yao_2020_CVPR,  
author = {Yao, Yuan and Liu, Chang and Luo, Dezhao and Zhou, Yu and Ye, Qixiang},  
title = {Video Playback Rate Perception for Self-Supervised Spatio-Temporal Representation Learning},  
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},  
month = {June},  
year = {2020}  
}

Owner

yuanyao366
