Self-Regulated Learning for Egocentric Video Activity Anticipation

Last update: Sep 23, 2022

Related tags

Overview

Self-Regulated Learning for Egocentric Video Activity Anticipation

Introduction

This is a Pytorch implementation of the model described in our paper:

Z. Qi, S. Wang, C. Su, L. Su, Q. Huang, and Q. Tian. Self-Regulated Learning for Egocentric Video Activity Anticipation. TPAMI 2021.

Dependencies

Pytorch >= 1.0.1
Cuda 9.0.176
Cudnn 7.4.2
Python 3.6.8

Data

EPIC-Kitchens dataset

For the raw data of the EPIC-Kitchens dataset, please refer to https://github.com/epic-kitchens/download-scripts to download.

For the three modality features (rgb, flow, obj), please refer to https://github.com/fpv-iplab/rulstm to download. After downloading, put them in the folder './data'.

EGTEA Gaze+ dataset

For the raw data of the EGTEA Gaze+ dataset, please refer to http://cbs.ic.gatech.edu/fpv/ to download.

For the extracted features, please refer to https://github.com/fpv-iplab/rulstm to download. After downloading, put them in the folder './data'.

50 Salads dataset

For the raw data of the 50 Salads dataset, please refer to http://cvip.computing.dundee.ac.uk/datasets/foodpreparation/50salads/ to download.

For the extracted features, please refer to https://github.com/colincsl/TemporalConvolutionalNetworks to download. After downloading, put them in the folder './data'.

Breakfast dataset

For the raw data of the Breakfast dataset, please refer to https://serre-lab.clps.brown.edu/resource/breakfast-actions-dataset/ to download.

For the extraced I3D features, please download from Baidu passward: 'wub3' or Google Drive. After downloading, put them in the folder './data'.

Train for Epic-Kitchen dataset

For rgb feature, python main.py --gpu_ids 0 --batch_size 128 --wd 1e-5 --lr 0.1 --reinforce_verb_weight 0.01 --reinforce_noun_weight 0.01 --revision_weight 0.8 --mode train --modality rgb --hidden 1024 --feat_in 1024

Silimar commonds can be used for flow or obj features.

Validation for Epic-Kitchen dataset

Please download the pre-trained model weigths from Baidu passward: 'wub3' or Google Drive, and put them in the folder './results/EPIC/base_srl/pre_trained/'.

For rgb feature, python main.py --gpu_ids 0 --batch_size 128 --mode validate --modality rgb --hidden 1024 --feat_in 1024 --resume_timestamp pre_trained

For flow feature, python main.py --gpu_ids 0 --batch_size 128 --mode validate --modality flow --hidden 1024 --feat_in 1024 --resume_timestamp pre_trained

For obj feature, python main.py --gpu_ids 0 --batch_size 128 --mode validate --modality obj --hidden 352 --feat_in 352 --resume_timestamp pre_trained

For three modality features, python main.py --gpu_ids 0 --batch_size 128 --mode validate --modality fusion --resume_timestamp pre_trained

Citation

Please cite our paper if you use this code in your own work:

@article{qi2021self,
  title={Self-Regulated Learning for Egocentric Video Activity Anticipation},
  author={Qi, Zhaobo and Wang, Shuhui and Su, Chi and Su, Li and Huang, Qingming and Tian, Qi},
  journal={IEEE Transactions on Pattern Analysis \& Machine Intelligence},
  number={01},
  pages={1--1},
  year={2021},
  publisher={IEEE Computer Society}
}

Concat

If you have any problem about our code, feel free to contact

[email protected]

Self-Regulated Learning for Egocentric Video Activity Anticipation

Related tags

Overview

Self-Regulated Learning for Egocentric Video Activity Anticipation

Introduction

Dependencies

Data

EPIC-Kitchens dataset

EGTEA Gaze+ dataset

50 Salads dataset

Breakfast dataset

Train for Epic-Kitchen dataset

Validation for Epic-Kitchen dataset

Citation

Concat

Owner

qzhb

Code for Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation (CVPR 2021)

PyTorch code for the paper "Complementarity is the King: Multi-modal and Multi-grained Hierarchical Semantic Enhancement Network for Cross-modal Retrieval".

DC540 hacking challenge 0x00005a.

Joint Detection and Identification Feature Learning for Person Search

[AAAI22] Reliable Propagation-Correction Modulation for Video Object Segmentation

BookMyShowPC - Movie Ticket Reservation App made with Tkinter

[CVPR'21] Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild

Code for NeurIPS 2021 paper "Curriculum Offline Imitation Learning"

Charsiu: A transformer-based phonetic aligner

Companion repo of the UCC 2021 paper "Predictive Auto-scaling with OpenStack Monasca"

Storage-optimizer - Identify potintial optimizations on the cloud storage accounts

Densely Connected Search Space for More Flexible Neural Architecture Search (CVPR2020)

A self-supervised 3D representation learning framework named viewpoint bottleneck.

Code repo for "RBSRICNN: Raw Burst Super-Resolution through Iterative Convolutional Neural Network" (Machine Learning and the Physical Sciences workshop in NeurIPS 2021).

Crowd-Kit is a powerful Python library that implements commonly-used aggregation methods for crowdsourced annotation and offers the relevant metrics and datasets

Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks

Pytorch port of Google Research's LEAF Audio paper

Single Image Random Dot Stereogram for Tensorflow

Notspot robot simulation - Python version

[ACM MM2021] MGH: Metadata Guided Hypergraph Modeling for Unsupervised Person Re-identification