Self-Regulated Learning for Egocentric Video Activity Anticipation

Last update: Sep 23, 2022

Introduction

This is a PyTorch implementation of the model described in our paper:

Z. Qi, S. Wang, C. Su, L. Su, Q. Huang, and Q. Tian. Self-Regulated Learning for Egocentric Video Activity Anticipation. TPAMI 2021.

Dependencies

PyTorch >= 1.0.1
CUDA 9.0.176
cuDNN 7.4.2
Python 3.6.8
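
The versions above can be checked with a short script. This is a minimal sketch, assuming the listed versions are minimums rather than exact pins (the list marks only PyTorch with `>=`). `parse_version` and `check_environment` are illustrative helpers and are not part of the repository.

```python
import importlib.util
import itertools
import sys

# Versions from the Dependencies list. Treating the Python version as a
# minimum is an assumption; the list only marks PyTorch with ">=".
REQUIRED_PYTHON = (3, 6)
REQUIRED_TORCH = (1, 0, 1)


def parse_version(text):
    """Turn a version string such as '1.0.1+cu90' into a tuple of ints."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        if not digits:
            break  # stop at non-numeric suffixes such as 'post2'
        parts.append(int(digits))
    return tuple(parts)


def check_environment():
    """Return a small report comparing the installed versions to the list."""
    report = {"python_ok": sys.version_info[:2] >= REQUIRED_PYTHON}
    if importlib.util.find_spec("torch") is None:
        report["torch_ok"] = False  # PyTorch is not installed
        return report
    import torch

    report["torch_ok"] = parse_version(torch.__version__) >= REQUIRED_TORCH
    report["cuda_available"] = torch.cuda.is_available()
    report["cuda_version"] = torch.version.cuda  # None for CPU-only builds
    report["cudnn_version"] = torch.backends.cudnn.version()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

The script only reports what it finds. Whether newer CUDA or cuDNN versions work with this code is not stated here, so treat a mismatch as something to investigate, not as a hard failure.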

Data

EPIC-Kitchens dataset

To download the raw data of the EPIC-Kitchens dataset, use the scripts at https://github.com/epic-kitchens/download-scripts.

The features for the three modalities (rgb, flow, obj) can be downloaded by following https://github.com/fpv-iplab/rulstm. After downloading, put them in the folder './data'.

EGTEA Gaze+ dataset

The raw data of the EGTEA Gaze+ dataset can be downloaded from http://cbs.ic.gatech.edu/fpv/.

The extracted features can be downloaded by following https://github.com/fpv-iplab/rulstm. After downloading, put them in the folder './data'.

50 Salads dataset

The raw data of the 50 Salads dataset can be downloaded from http://cvip.computing.dundee.ac.uk/datasets/foodpreparation/50salads/.

The extracted features can be downloaded by following https://github.com/colincsl/TemporalConvolutionalNetworks. After downloading, put them in the folder './data'.

Breakfast dataset

The raw data of the Breakfast dataset can be downloaded from https://serre-lab.clps.brown.edu/resource/breakfast-actions-dataset/.

The extracted I3D features can be downloaded from Baidu (password: 'wub3') or Google Drive. After downloading, put them in the folder './data'.
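
Before training, it can help to confirm that the downloaded features ended up in './data'. The sketch below only lists what is inside that folder. It makes no assumption about the names of the feature files or subfolders, since these are not specified here, and `summarize_data_dir` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def summarize_data_dir(data_dir="./data"):
    """Return the sorted entry names in data_dir, or None if it is missing."""
    root = Path(data_dir)
    if not root.is_dir():
        return None
    return sorted(entry.name for entry in root.iterdir())


if __name__ == "__main__":
    entries = summarize_data_dir()
    if entries is None:
        print("./data not found; create it and copy the downloaded features there")
    elif not entries:
        print("./data exists but is empty")
    else:
        print("Found in ./data: " + ", ".join(entries))
```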

Training on the EPIC-Kitchens dataset

For the rgb feature:

python main.py --gpu_ids 0 --batch_size 128 --wd 1e-5 --lr 0.1 --reinforce_verb_weight 0.01 --reinforce_noun_weight 0.01 --revision_weight 0.8 --mode train --modality rgb --hidden 1024 --feat_in 1024

Similar commands can be used for the flow or obj features.
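
As a rough illustration, the training command for each modality could be generated as follows. This is a sketch under stated assumptions: the `--hidden` and `--feat_in` values for flow and obj are taken from the validation commands in this README, and the remaining hyperparameters (`--lr`, `--wd`, and the three weights) are simply copied from the rgb command, which may not be the best settings for the other modalities. `build_train_command` is a hypothetical helper, not part of the repository.

```python
import shlex

# Feature dimensions per modality, as used by the validation commands in
# this README (rgb and flow: 1024, obj: 352).
FEATURE_DIMS = {"rgb": 1024, "flow": 1024, "obj": 352}


def build_train_command(modality, gpu_ids="0"):
    """Build the main.py training command for one modality as an argument list.

    Assumption: hyperparameters other than the feature dimensions are reused
    unchanged from the rgb command shown in the README.
    """
    dim = str(FEATURE_DIMS[modality])
    return [
        "python", "main.py",
        "--gpu_ids", gpu_ids,
        "--batch_size", "128",
        "--wd", "1e-5",
        "--lr", "0.1",
        "--reinforce_verb_weight", "0.01",
        "--reinforce_noun_weight", "0.01",
        "--revision_weight", "0.8",
        "--mode", "train",
        "--modality", modality,
        "--hidden", dim,
        "--feat_in", dim,
    ]


if __name__ == "__main__":
    # Print one shell-ready command per modality.
    for modality in ("rgb", "flow", "obj"):
        print(" ".join(shlex.quote(arg) for arg in build_train_command(modality)))
```

Running the script prints the three commands so they can be pasted into a shell; for rgb the output is identical to the command given above.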

Validation on the EPIC-Kitchens dataset

Please download the pre-trained model weights from Baidu (password: 'wub3') or Google Drive, and put them in the folder './results/EPIC/base_srl/pre_trained/'.

For the rgb feature:

python main.py --gpu_ids 0 --batch_size 128 --mode validate --modality rgb --hidden 1024 --feat_in 1024 --resume_timestamp pre_trained

For the flow feature:

python main.py --gpu_ids 0 --batch_size 128 --mode validate --modality flow --hidden 1024 --feat_in 1024 --resume_timestamp pre_trained

For the obj feature:

python main.py --gpu_ids 0 --batch_size 128 --mode validate --modality obj --hidden 352 --feat_in 352 --resume_timestamp pre_trained

For the fusion of the three modality features:

python main.py --gpu_ids 0 --batch_size 128 --mode validate --modality fusion --resume_timestamp pre_trained
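
The four validation runs can also be driven from one script. The following is a minimal sketch, assuming it is launched from the repository root next to `main.py` and that the pre-trained weights were placed in the folder named above. `build_validate_command` and `validate_all` are hypothetical helpers, and the existence check on the weights folder is an added convenience, not something `main.py` is known to require.

```python
import subprocess
from pathlib import Path

# Folder for the pre-trained weights, as given in this README.
PRETRAINED_DIR = Path("./results/EPIC/base_srl/pre_trained")
# Feature dimensions from the validation commands; fusion takes none.
FEATURE_DIMS = {"rgb": 1024, "flow": 1024, "obj": 352}


def build_validate_command(modality, timestamp="pre_trained"):
    """Build the main.py validation command for one modality."""
    cmd = ["python", "main.py", "--gpu_ids", "0", "--batch_size", "128",
           "--mode", "validate", "--modality", modality]
    if modality in FEATURE_DIMS:
        dim = str(FEATURE_DIMS[modality])
        cmd += ["--hidden", dim, "--feat_in", dim]
    cmd += ["--resume_timestamp", timestamp]
    return cmd


def validate_all(dry_run=True):
    """Print each validation command and, unless dry_run, execute it."""
    if not dry_run and not PRETRAINED_DIR.is_dir():
        raise FileNotFoundError(f"missing weights folder: {PRETRAINED_DIR}")
    commands = [build_validate_command(m) for m in ("rgb", "flow", "obj", "fusion")]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing run
    return commands


if __name__ == "__main__":
    validate_all(dry_run=True)
```

With `dry_run=True` the script only prints the commands. Switching to `dry_run=False` runs them one after another and stops at the first failure.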

Citation

Please cite our paper if you use this code in your own work:

@article{qi2021self,
  title={Self-Regulated Learning for Egocentric Video Activity Anticipation},
  author={Qi, Zhaobo and Wang, Shuhui and Su, Chi and Su, Li and Huang, Qingming and Tian, Qi},
  journal={IEEE Transactions on Pattern Analysis \& Machine Intelligence},
  number={01},
  pages={1--1},
  year={2021},
  publisher={IEEE Computer Society}
}

Contact

If you have any problems with our code, feel free to contact

[email protected]

Owner

qzhb
