LSTC: Boosting Atomic Action Detection with Long-Short-Term Context

Last update: Oct 11, 2022

This repository contains the code for the AVA experiments of our ACM MM 2021 paper, "LSTC: Boosting Atomic Action Detection with Long-Short-Term Context".

Installation

See INSTALL.md for details on installing the codebase, including requirements and environment settings.

Data

For data preparation and setup, LSTC strictly follows the processing pipeline of PySlowFast. See DATASET.md for details on preparing the data.

Run the code

We take SlowFast-ResNet50 as an example.

Train the model

python3 tools/run_net.py --cfg config/AVA/SLOWFAST_32x12_R50_LFB.yaml \
    AVA.FEATURE_BANK_PATH 'path/to/feature/bank/folder' \
    TRAIN.CHECKPOINT_FILE_PATH 'path/to/pretrained/backbone' \
    OUTPUT_DIR 'path/to/output/folder'

Test the model

python3 tools/run_net.py --cfg config/AVA/SLOWFAST_32x12_R50_LFB.yaml \
    AVA.FEATURE_BANK_PATH 'path/to/feature/bank/folder' \
    OUTPUT_DIR 'path/to/output/folder' \
    TRAIN.ENABLE False \
    TEST.ENABLE True
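
For scripted runs, the two commands above can also be assembled from Python. The following is a minimal sketch and not part of the repository: the helper name `build_run_command` is hypothetical, and it relies only on the `--cfg` flag and the `KEY VALUE` override pairs shown above.

```python
import shlex


def build_run_command(cfg_path, overrides):
    """Return the argv list for tools/run_net.py (hypothetical helper).

    `overrides` maps config keys such as "OUTPUT_DIR" to values. Each pair is
    appended as two separate arguments, mirroring the commands above.
    """
    argv = ["python3", "tools/run_net.py", "--cfg", cfg_path]
    for key, value in overrides.items():
        argv += [key, str(value)]
    return argv


# Same settings as the test command above (placeholder paths kept as-is).
test_argv = build_run_command(
    "config/AVA/SLOWFAST_32x12_R50_LFB.yaml",
    {
        "AVA.FEATURE_BANK_PATH": "path/to/feature/bank/folder",
        "OUTPUT_DIR": "path/to/output/folder",
        "TRAIN.ENABLE": False,
        "TEST.ENABLE": True,
    },
)

# Print a copy-pasteable shell line.
print(shlex.join(test_argv))
```

Passing `test_argv` to `subprocess.run` would execute the command. Building a list rather than a single string avoids the line-continuation pitfalls of multi-line shell commands.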

To start DDP training from the command line with torch.distributed.launch, set start_method='cmd' in tools/run_net.py.
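
As a rough illustration of such a command-line launch, the sketch below builds a single-node `torch.distributed.launch` invocation. It assumes `start_method='cmd'` has already been set as described. The helper name `build_ddp_command` and the GPU count of 8 are illustrative assumptions, and this README does not document which launcher arguments the script expects, so treat it as a starting point rather than a verified recipe.

```python
def build_ddp_command(num_gpus, script_args):
    """Return argv that starts one process per GPU on a single node.

    Hypothetical helper: only `--nproc_per_node` (a standard option of
    torch.distributed.launch) is used. Multi-node options are left out.
    """
    launcher = [
        "python3", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
    ]
    return launcher + ["tools/run_net.py"] + list(script_args)


# Assumed setup: 8 GPUs on one machine, same config file as the examples above.
ddp_argv = build_ddp_command(
    8,
    [
        "--cfg", "config/AVA/SLOWFAST_32x12_R50_LFB.yaml",
        "OUTPUT_DIR", "path/to/output/folder",
    ],
)
print(" ".join(ddp_argv))
```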

Resource

The codebase provides the following resources for fast training and validation:

Pretrained backbone on Kinetics

backbone	dataset	model type	link
ResNet50	Kinetics400	Caffe2	Google Drive/Baidu Disk (Code: y1wl)
ResNet101	Kinetics600	Caffe2	Google Drive/Baidu Disk (Code: slde)

Extracted long-term feature bank

backbone	feature bank (LMDB)	dimension
ResNet50	Google Drive	1280
ResNet101	Google Drive	2304

Checkpoint file

backbone	checkpoint	model type
ResNet50	Google Drive/Baidu Disk (Code: fi0s)	pytorch
ResNet101	Google Drive/Baidu Disk (Code: g63o)	pytorch

Acknowledgement

This codebase is built upon PySlowFast.

Citation

If you find this repository helpful for your research, please cite the following paper:

@InProceedings{Yuxi_2021_ACM,
  author = {Li, Yuxi and Zhang, Boshen and Li, Jian and Wang, Yabiao and Wang, Chengjie and Li, Jilin and Huang, Feiyue and Lin, Weiyao},
  title = {LSTC: Boosting Atomic Action Detection with Long-Short-Term Context},
  booktitle = {ACM Conference on Multimedia},
  month = {October},
  year = {2021}
}

Owner

Tencent YouTu Research
