Implementation of our paper "Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning".

Related tags

Deep LearningPRP
Overview

PRP

Introduction

This is the implementation of our paper "Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning".

Getting started

  • Install

    Our experiments run on Python 3.6.1 and PyTorch 0.4.1. All dependencies can be installed using pip:

    python -m pip install -r requirements.txt
  • Data preparation

    We construct experiments on UCF101 and HMDB51 (the split1 of UCF101 for pre-training and the rest for fine-tuning). The expected dataset directory hierarchy is as follow:

    ├── UCF101/HMDB51
    │   ├── split
    │   │   ├── classInd.txt
    │   │   ├── testlist01.txt
    │   │   ├── trainlist01.txt
    │   │   └── ...
    │   └── video
    │       ├── ApplyEyeMakeup
    │       │   └── *.avi
    │       └── ...
    └── ...
    
  • Train and Test Pre-training on Pretext Task

    python train_predict.py --gpu 0 --epoch 300 --model_name c3d/r21d/r3d

    Action Recognition

    python ft_classfy.py --gpu 0 --model_name c3d/r21d/r3d --pre_path [your pre-trained model] --split 1/2/3
    python test_classify.py

    Video Retrieval

    Please refer to the code video_retrieval_samples.py of VCOP.

Model zoo

  • Models

    Pre-trained PRP model on the split1 of UCF101: C3D(OneDrive); R3D(OneDrive); R(2+1)D(OneDrive)

  • Action Recognition Results

    Architecture UCF101(%) HMDB51(%)
    C3D 69.1 34.5
    R3D 66.5 29.7
    R(2+1)D 72.1 35.0

License

This project is released under the Apache 2.0 license.

Citation

Please cite the following paper if you feel RSPNet useful to your research

@InProceedings{Yao_2020_CVPR,  
author = {Yao, Yuan and Liu, Chang and Luo, Dezhao and Zhou, Yu and Ye, Qixiang},  
title = {Video Playback Rate Perception for Self-Supervised Spatio-Temporal Representation Learning},  
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},  
month = {June},  
year = {2020}  
}
Owner
yuanyao366
yuanyao366
PyExplainer: A Local Rule-Based Model-Agnostic Technique (Explainable AI)

PyExplainer PyExplainer is a local rule-based model-agnostic technique for generating explanations (i.e., why a commit is predicted as defective) of J

AI Wizards for Software Management (AWSM) Research Group 14 Nov 13, 2022
Huawei Hackathon 2021 - Sweden (Stockholm)

huawei-hackathon-2021 Contributors DrakeAxelrod Challenge Requirements: python=3.8.10 Standard libraries (no importing) Important factors: Data depend

Drake Axelrod 32 Nov 08, 2022
Rule Based Classification Project

Kural Tabanlı Sınıflandırma ile Potansiyel Müşteri Getirisi Hesaplama İş Problemi: Bir oyun şirketi müşterilerinin bazı özelliklerini kullanaraknseviy

Şafak 1 Jan 12, 2022
LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection

LiDAR Distillation Paper | Model LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection Yi Wei, Zibu Wei, Yongming Rao, Jiax

Yi Wei 75 Dec 22, 2022
Python scripts for performing road segemtnation and car detection using the HybridNets multitask model in ONNX.

ONNX-HybridNets-Multitask-Road-Detection Python scripts for performing road segemtnation and car detection using the HybridNets multitask model in ONN

Ibai Gorordo 45 Jan 01, 2023
Repo for the Video Person Clustering dataset, and code for the associated paper

Video Person Clustering Repo for the Video Person Clustering dataset, and code for the associated paper. This reporsitory contains the Video Person Cl

Andrew Brown 47 Nov 02, 2022
Little tool in python to watch anime from the terminal (the better way to watch anime)

ani-cli Script working again :), thanks to the fork by Dink4n for the alternative approach to by pass the captcha on gogoanime A cli to browse and wat

Harshith 4.5k Dec 31, 2022
Simulating an AI playing 2048 using the Expectimax algorithm

2048-expectimax Simulating an AI playing 2048 using the Expectimax algorithm The base game engine uses code from here. The AI player is modeled as a m

Subha Ramesh 2 Jan 31, 2022
Official Repo for Ground-aware Monocular 3D Object Detection for Autonomous Driving

Visual 3D Detection Package: This repo aims to provide flexible and reproducible visual 3D detection on KITTI dataset. We expect scripts starting from

Yuxuan Liu 305 Dec 19, 2022
DCGAN-tensorflow - A tensorflow implementation of Deep Convolutional Generative Adversarial Networks

DCGAN in Tensorflow Tensorflow implementation of Deep Convolutional Generative Adversarial Networks which is a stabilize Generative Adversarial Networ

Taehoon Kim 7.1k Dec 29, 2022
Pyramid Pooling Transformer for Scene Understanding

Pyramid Pooling Transformer for Scene Understanding Requirements: torch 1.6+ torchvision 0.7.0 timm==0.3.2 Validated on torch 1.6.0, torchvision 0.7.0

Yu-Huan Wu 119 Dec 29, 2022
[ICCV 2021] HRegNet: A Hierarchical Network for Large-scale Outdoor LiDAR Point Cloud Registration

HRegNet: A Hierarchical Network for Large-scale Outdoor LiDAR Point Cloud Registration Introduction The repository contains the source code and pre-tr

Intelligent Sensing, Perception and Computing Group 55 Dec 14, 2022
Speckle-free Holography with Partially Coherent Light Sources and Camera-in-the-loop Calibration

Speckle-free Holography with Partially Coherent Light Sources and Camera-in-the-loop Calibration Project Page | Paper Yifan Peng*, Suyeon Choi*, Jongh

Stanford Computational Imaging Lab 19 Dec 11, 2022
Machine Learning Time-Series Platform

cesium: Open-Source Platform for Time Series Inference Summary cesium is an open source library that allows users to: extract features from raw time s

632 Dec 26, 2022
Geometric Deep Learning Extension Library for PyTorch

Documentation | Paper | Colab Notebooks | External Resources | OGB Examples PyTorch Geometric (PyG) is a geometric deep learning extension library for

Matthias Fey 16.5k Jan 08, 2023
StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation

StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation Demo video: CVPR 2021 Oral: Single Channel Manipulation: Localized or attribu

Zongze Wu 267 Dec 30, 2022
OpenVisionAPI server

🚀 Quick start An instance of ova-server is free and publicly available here: https://api.openvisionapi.com Checkout ova-client for a quick demo. Inst

Open Vision API 93 Nov 24, 2022
PyTorch implementation of DeepDream algorithm

neural-dream This is a PyTorch implementation of DeepDream. The code is based on neural-style-pt. Here we DeepDream a photograph of the Golden Gate Br

121 Nov 05, 2022
Sarus implementation of classical ML models. The models are implemented using the Keras API of tensorflow 2. Vizualization are implemented and can be seen in tensorboard.

Sarus published models Sarus implementation of classical ML models. The models are implemented using the Keras API of tensorflow 2. Vizualization are

Sarus Technologies 39 Aug 19, 2022
[SIGGRAPH Asia 2021] DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning.

DeepVecFont This is the homepage for "DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning". Yizhi Wang and Zhouhui Lian. WI

Yizhi Wang 17 Dec 22, 2022