PyVideoAI: Action Recognition Framework

Overview

This reposity contains official implementation of:

PyVideoAI: Action Recognition Framework

The only framework that completes your computer vision, action recognition research environment.

** Key features **

  • Supports multi-gpu, multi-node training.
  • STOA models such as I3D, Non-local, TSN, TRN, TSM, MVFNet, ..., and even ImageNet training!
  • Many datasets such as Kinetics-400, EPIC-Kitchens-55, Something-Something-V1/V2, HMDB-51, UCF-101, Diving48, CATER, ...
  • Supports both video decoding (straight from .avi/mp4) and frame extracted (.jpg/png) dataloaders, sparse-sample and dense-sample.
  • Any popular LR scheduling like Cosine Annealing with Warm Restart, Step LR, and Reduce LR on Plateau.
  • Early stopping when training doesn't improve (customise your condition)
  • Easily add custom model, optimiser, scheduler, loss and dataloader!
  • Telegram bot reporting experiment status.
  • TensorBoard reporting stats.
  • Colour logging
  • All of the above come with no extra setup. Trust me and try some examples.

** Papers implemented **

This package is motivated by PySlowFast from Facebook AI. The PySlowFast is a cool framework, but it depends too much on their config system and it was difficult to add new models (other codes) or reuse part of the modules from the framework.
This framework by Kiyoon, is designed to replace all the configuration systems to Python files, which enables easy-addition of custom models/LR scheduling/dataloader etc.
Just modify the function bodies in the config files!

Difference between the two config systems can be found in CONFIG_SYSTEM.md.

Getting Started

Jupyter Notebook examples to run:

  • HMDB-51 data preparation
  • Inference on pre-trained model from the model zoo, and visualise model/dataloader/per-class performance.
  • Training I3D using Kinetics pretrained model
  • Using image model and ImageNet dataset

is provided in the examples!

Structure

All of the executable files are in tools/.
dataset_configs/ directory configures datasets. For example, where is the dataset stored, number of classes, single-label or multi-label training, dataset-specific visualisation settings (confusion matrix has different output sizes)
model_configs/ directory configures model architectures. For example, model definition, input preprocessing mean/std.
exp_configs/ directory configures other training settings like optimiser, scheduling, dataloader, number of frames as input. The config file path has to be in exp_configs/[dataset_name]/[model_name]_[experiment_name].py format.

Usage

Preparing datasets

This package supports many action recognition datasets such as HMDB-51, EPIC-Kitchens-55, Something-Something-V1, CATER, etc.
Refer to DATASET.md.

Training command

CUDA_VISIBLE_DEVICES=0,1,2,3 python tools/run_train.py -D {dataset_config_name} -M {model_config_name} -E {exp_config_name} --local_world_size {num_GPUs} -e {num_epochs}

--local_world_size denotes the number of GPUs per computing node.

Telegram Bot

You can preview experiment results using Telegram bots!
Telegram bot stat report example

If your code raises an exception, it will report you too.
Telegram error report example

You can quickly take a look at example video inputs (as GIF or JPEGs) from the dataloader.
Use tools/visualisations/model_and_dataloader_visualiser.py
Telegram video input report example

[Telegram0]
token=
chat_id=

Model Zoo and Baselines

Refer to MODEL_ZOO.md

Installation

Refer to INSTALL.md.

TL;DR,

conda create -n videoai python=3.8
conda activate videoai
conda install pytorch==1.9.1 torchvision==0.10.1 cudatoolkit=10.2 -c pytorch
### For RTX 30xx GPUs,
#conda install pytorch==1.9.1 torchvision==0.10.1 cudatoolkit=11.1 -c pytorch -c nvidia
 

git clone --recurse-submodules https://github.com/kiyoon/PyVideoAI.git
cd PyVideoAI
git checkout v0.3
git submodule update --recursive
cd submodules/video_datasets_api
pip install -e .
cd ../experiment_utils
pip install -e .
cd ../..
pip install -e .

Experiment outputs

The experiment results (log, training stats, weights, tensorboard, plots, etc.) are saved to data/experiments by default. This can be huge, so make sure you make a softlink of a directory you really want to use. (recommended)

Otherwise, you can change pyvideoai/config.py's DEFAULT_EXPERIMENT_ROOT value. Or, you can also set --experiment_root argument manually when executing.

Owner
Kiyoon Kim
Computer scientist with computer vision, machine learning and signal processing background.
Kiyoon Kim
This repository collects 100 papers related to negative sampling methods.

Negative-Sampling-Paper This repository collects 100 papers related to negative sampling methods, covering multiple research fields such as Recommenda

RUCAIBox 119 Dec 29, 2022
[ICLR 2021] "CPT: Efficient Deep Neural Network Training via Cyclic Precision" by Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yining Ding, Vikas Chandra, Yingyan Lin

CPT: Efficient Deep Neural Network Training via Cyclic Precision Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yining Ding, Vikas Chandra, Yingyan Lin Accep

26 Oct 25, 2022
This is the official github repository of the Met dataset

The Met dataset This is the official github repository of the Met dataset. The official webpage of the dataset can be found here. What is it? This cod

Nikolaos-Antonios Ypsilantis 35 Dec 17, 2022
PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more

PyTorch Image Models Sponsors What's New Introduction Models Features Results Getting Started (Documentation) Train, Validation, Inference Scripts Awe

Ross Wightman 22.9k Jan 09, 2023
A tutorial showing how to train, convert, and run TensorFlow Lite object detection models on Android devices, the Raspberry Pi, and more!

A tutorial showing how to train, convert, and run TensorFlow Lite object detection models on Android devices, the Raspberry Pi, and more!

Evan 1.3k Jan 02, 2023
Deep learning model, heat map, data prepo

deep learning model, heat map, data prepo

Pamela Dekas 1 Jan 14, 2022
We propose a new method for effective shadow removal by regarding it as an exposure fusion problem.

Auto-exposure fusion for single-image shadow removal We propose a new method for effective shadow removal by regarding it as an exposure fusion proble

Qing Guo 146 Dec 31, 2022
Teaches a student network from the knowledge obtained via training of a larger teacher network

Distilling-the-knowledge-in-neural-network Teaches a student network from the knowledge obtained via training of a larger teacher network This is an i

Abhishek Sinha 146 Dec 11, 2022
MDETR: Modulated Detection for End-to-End Multi-Modal Understanding

MDETR: Modulated Detection for End-to-End Multi-Modal Understanding Website • Colab • Paper This repository contains code and links to pre-trained mod

Aishwarya Kamath 770 Dec 28, 2022
基于DouZero定制AI实战欢乐斗地主

DouZero_For_Happy_DouDiZhu: 将DouZero用于欢乐斗地主实战 本项目基于DouZero 环境配置请移步项目DouZero 模型默认为WP,更换模型请修改start.py中的模型路径 运行main.py即可 SL (baselines/sl/): 基于人类数据进行深度学习

1.5k Jan 08, 2023
Degree-Quant: Quantization-Aware Training for Graph Neural Networks.

Degree-Quant This repo provides a clean re-implementation of the code associated with the paper Degree-Quant: Quantization-Aware Training for Graph Ne

35 Oct 07, 2022
Official code for the ICCV 2021 paper "DECA: Deep viewpoint-Equivariant human pose estimation using Capsule Autoencoders"

DECA Official code for the ICCV 2021 paper "DECA: Deep viewpoint-Equivariant human pose estimation using Capsule Autoencoders". All the code is writte

23 Dec 01, 2022
Materials for my scikit-learn tutorial

Scikit-learn Tutorial Jake VanderPlas email: [email protected] twitter: @jakevdp gith

Jake Vanderplas 1.6k Dec 30, 2022
Security evaluation module with onnx, pytorch, and SecML.

🚀 🐼 🔥 PandaVision Integrate and automate security evaluations with onnx, pytorch, and SecML! Installation Starting the server without Docker If you

Maura Pintor 11 Apr 12, 2022
Code for HodgeNet: Learning Spectral Geometry on Triangle Meshes, in SIGGRAPH 2021.

HodgeNet | Webpage | Paper | Video HodgeNet: Learning Spectral Geometry on Triangle Meshes Dmitriy Smirnov, Justin Solomon SIGGRAPH 2021 Set-up To ins

Dima Smirnov 61 Nov 27, 2022
PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations

PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations

Thalles Silva 1.7k Dec 28, 2022
Motion Reconstruction Code and Data for Skills from Videos (SFV)

Motion Reconstruction Code and Data for Skills from Videos (SFV) This repo contains the data and the code for motion reconstruction component of the S

268 Dec 01, 2022
Outlier Exposure with Confidence Control for Out-of-Distribution Detection

OOD-detection-using-OECC This repository contains the essential code for the paper Outlier Exposure with Confidence Control for Out-of-Distribution De

Nazim Shaikh 64 Nov 02, 2022
AquaTimer - Programmable Timer for Aquariums based on ATtiny414/814/1614

AquaTimer - Programmable Timer for Aquariums based on ATtiny414/814/1614 AquaTimer is a programmable timer for 12V devices such as lighting, solenoid

Stefan Wagner 4 Jun 13, 2022
3D Generative Adversarial Network

Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling This repository contains pre-trained models and sampling

Chengkai Zhang 791 Dec 20, 2022