Official PyTorch Implementation of paper EAN: Event Adaptive Network for Efficient Action Recognition

Overview

EAN: Event Adaptive Network

PWC

PyTorch Implementation of paper:

EAN: Event Adaptive Network for Enhanced Action Recognition

Yuan Tian, Yichao Yan, Xiongkuo Min, Guo Lu, Guangtao Zhai, Guodong Guo, and Zhiyong Gao

[ArXiv]

Main Contribution

Efficiently modeling spatial-temporal information in videos is crucial for action recognition. In this paper, we propose a unified action recognition framework to investigate the dynamic nature of video content by introducing the following designs. First, when extracting local cues, we generate the spatial-temporal kernels of dynamic-scale to adaptively fit the diverse events. Second, to accurately aggregate these cues into a global video representation, we propose to mine the interactions only among a few selected foreground objects by a Transformer, which yields a sparse paradigm. We call the proposed framework as Event Adaptive Network (EAN) because both key designs are adaptive to the input video content. To exploit the short-term motions within local segments, we propose a novel and efficient Latent Motion Code (LMC) module, further improving the performance of the framework.

Content

Dependencies

Please make sure the following libraries are installed successfully:

Data Preparation

Following the common practice, we need to first extract videos into frames for fast data loading. Please refer to TSN repo for the detailed guide of data pre-processing. We have successfully trained on Something-Something-V1 and V2, Kinetics, Diving48 datasets with this codebase. Basically, the processing of video data can be summarized into 3 steps:

  1. Extract frames from videos:

  2. Generate file lists needed for dataloader:

    • Each line of the list file will contain a tuple of (extracted video frame folder name, video frame number, and video groundtruth class). A list file looks like this:

      video_frame_folder 100 10
      video_2_frame_folder 150 31
      ...
      
    • Or you can use off-the-shelf tools provided by the repos: data_process/gen_label_xxx.py

  3. Edit dataset config information in datasets_video.py

Pretrained Models

Here, we provide the pretrained models of EAN models on Something-Something-V1 datasets. Recognizing actions in this dataset requires strong temporal modeling ability. EAN achieves state-of-the-art performance on these datasets. Notably, our method even surpasses optical flow based methods while with only RGB frames as input.

Something-Something-V1

Model Backbone FLOPs Val Top1 Val Top5 Checkpoints
EAN8F(RGB+LMC) ResNet-50 37G 53.4 81.1 [Jianguo Cloud]
EAN16(RGB+LMC) 74G 54.7 82.3
EAN16+8(RGB+LMC) 111G 57.2 83.9
EAN2 x (16+8)(RGB+LMC) 222G 57.5 84.3

Testing

For example, to test the EAN models on Something-Something-V1, you can first put the downloaded .pth.tar files into the "pretrained" folder and then run:

# test EAN model with 8frames clip
bash scripts/test/sthv1/RGB_LMC_8F.sh

# test EAN model with 16frames clip
bash scripts/test/sthv1/RGB_LMC_16F.sh

Training

We provided several scripts to train EAN with this repo, please refer to "scripts" folder for more details. For example, to train PAN on Something-Something-V1, you can run:

# train EAN model with 8frames clip
bash scripts/train/sthv1/RGB_LMC_8F.sh

Notice that you should scale up the learning rate with batch size. For example, if you use a batch size of 32 you should set learning rate to 0.005.

Other Info

References

This repository is built upon the following baseline implementations for the action recognition task.

Citation

Please [★star] this repo and [cite] the following arXiv paper if you feel our EAN useful to your research:

@misc{tian2021ean,
      title={EAN: Event Adaptive Network for Enhanced Action Recognition}, 
      author={Yuan Tian and Yichao Yan and Xiongkuo Min and Guo Lu and Guangtao Zhai and Guodong Guo and Zhiyong Gao},
      year={2021},
      eprint={2107.10771},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Contact

For any questions, please feel free to open an issue or contact:

Yuan Tian: [email protected]
Owner
TianYuan
TianYuan
WTTE-RNN a framework for churn and time to event prediction

WTTE-RNN Weibull Time To Event Recurrent Neural Network A less hacky machine-learning framework for churn- and time to event prediction. Forecasting p

Egil Martinsson 727 Dec 28, 2022
Official pytorch implementation of the paper: "SinGAN: Learning a Generative Model from a Single Natural Image"

SinGAN Project | Arxiv | CVF | Supplementary materials | Talk (ICCV`19) Official pytorch implementation of the paper: "SinGAN: Learning a Generative M

Tamar Rott Shaham 3.2k Dec 25, 2022
Voxel-based Network for Shape Completion by Leveraging Edge Generation (ICCV 2021, oral)

Voxel-based Network for Shape Completion by Leveraging Edge Generation This is the PyTorch implementation for the paper "Voxel-based Network for Shape

10 Dec 04, 2022
BED: A Real-Time Object Detection System for Edge Devices

BED: A Real-Time Object Detection System for Edge Devices About this project Thi

Data Analytics Lab at Texas A&M University 44 Nov 18, 2022
Deep Learning and Reinforcement Learning Library for Scientists and Engineers 🔥

TensorLayer is a novel TensorFlow-based deep learning and reinforcement learning library designed for researchers and engineers. It provides an extens

TensorLayer Community 7.1k Dec 27, 2022
Automatic meme generation model using Tensorflow Keras.

Memefly You can find the project at MemeflyAI. Contributors Nick Buukhalter Harsh Desai Han Lee Project Overview Trello Board Product Canvas Automatic

BloomTech Labs 2 Jan 13, 2022
Analysis of Smiles through reservoir sampling & RDkit

Analysis of Smiles through reservoir sampling and machine learning (under development). This is a simple project that includes two Jupyter files for t

Aurimas A. Nausėdas 6 Aug 30, 2022
Code for the paper "Learning-Augmented Algorithms for Online Steiner Tree"

Learning-Augmented Algorithms for Online Steiner Tree This is the code for the paper "Learning-Augmented Algorithms for Online Steiner Tree". Requirem

0 Dec 09, 2021
(CVPR 2021) Lifting 2D StyleGAN for 3D-Aware Face Generation

Lifting 2D StyleGAN for 3D-Aware Face Generation Official implementation of paper "Lifting 2D StyleGAN for 3D-Aware Face Generation". Requirements You

Yichun Shi 66 Nov 29, 2022
Source code for "Interactive All-Hex Meshing via Cuboid Decomposition [SIGGRAPH Asia 2021]".

Interactive All-Hex Meshing via Cuboid Decomposition Video demonstration This repository contains an interactive software to the PolyCube-based hex-me

Lingxiao Li 131 Dec 05, 2022
Codes for the compilation and visualization examples to the HIF vegetation dataset

High-impedance vegetation fault dataset This repository contains the codes that compile the "Vegetation Conduction Ignition Test Report" data, which a

1 Dec 12, 2021
PyTorch implementation of: Michieli U. and Zanuttigh P., "Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations", CVPR 2021.

Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations This is the official PyTorch implementation

Multimedia Technology and Telecommunication Lab 42 Nov 09, 2022
Neighbor2Seq: Deep Learning on Massive Graphs by Transforming Neighbors to Sequences

Neighbor2Seq: Deep Learning on Massive Graphs by Transforming Neighbors to Sequences This repository is an official PyTorch implementation of Neighbor

DIVE Lab, Texas A&M University 8 Jun 12, 2022
Simple, efficient and flexible vision toolbox for mxnet framework.

MXbox: Simple, efficient and flexible vision toolbox for mxnet framework. MXbox is a toolbox aiming to provide a general and simple interface for visi

Ligeng Zhu 31 Oct 19, 2019
Official Implementation of "DialogLM: Pre-trained Model for Long Dialogue Understanding and Summarization."

DialogLM Code for AAAI 2022 paper: DialogLM: Pre-trained Model for Long Dialogue Understanding and Summarization. Pre-trained Models We release two ve

Microsoft 92 Dec 19, 2022
pytorch implementation of dftd2 & dftd3

torch-dftd pytorch implementation of dftd2 [1] & dftd3 [2, 3] Install # Install from pypi pip install torch-dftd # Install from source (for developer

33 Nov 28, 2022
This MVP data web app uses the Streamlit framework and Facebook's Prophet forecasting package to generate a dynamic forecast from your own data.

📈 Automated Time Series Forecasting Background: This MVP data web app uses the Streamlit framework and Facebook's Prophet forecasting package to gene

Zach Renwick 42 Jan 04, 2023
Code Release for the paper "TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation"

TriBERT This repository contains the code for the NeurIPS 2021 paper titled "TriBERT: Full-body Human-centric Audio-visual Representation Learning for

UBC Computer Vision Group 8 Aug 31, 2022
Full Stack Deep Learning Labs

Full Stack Deep Learning Labs Welcome! Project developed during lab sessions of the Full Stack Deep Learning Bootcamp. We will build a handwriting rec

Full Stack Deep Learning 1.2k Dec 31, 2022
Deep Learning Based EDM Subgenre Classification using Mel-Spectrogram and Tempogram Features"

EDM-subgenre-classifier This repository contains the code for "Deep Learning Based EDM Subgenre Classification using Mel-Spectrogram and Tempogram Fea

11 Dec 20, 2022