
Towards Long-Form Video Understanding

Chao-Yuan Wu, Philipp Krähenbühl, CVPR 2021

Citation

@inproceedings{lvu2021,
  Author    = {Chao-Yuan Wu and Philipp Kr\"{a}henb\"{u}hl},
  Title     = {{Towards Long-Form Video Understanding}},
  Booktitle = {{CVPR}},
  Year      = {2021}}

Overview

This repo implements Object Transformers for long-form video understanding.

Getting Started

Please organize data/ as follows:

data
|_ ava
|_ features
|_ instance_meta
|_ lvu_1.0

ava, features, and instance_meta can be found in this Google Drive folder. lvu_1.0 can be found here.

Please also download the pre-trained weights from this Google Drive folder and put them in pretrained_models/.
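
Before running any of the scripts below, it may help to confirm the layout is in place. The following is a minimal sanity-check sketch (not part of this repo); the paths it checks are exactly the ones listed above.

# check_layout.py -- verify the expected data/ and pretrained_models/ folders exist
from pathlib import Path

expected = [
    "data/ava",
    "data/features",
    "data/instance_meta",
    "data/lvu_1.0",
    "pretrained_models",
]

missing = [p for p in expected if not Path(p).is_dir()]
print("missing directories:", missing if missing else "none")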

Pre-training

bash run_pretrain.sh

This pre-trains on a small demo dataset, data/instance_meta/instance_meta_pretrain_demo.pkl, as an example. Please follow its file format if you'd like to pre-train on a larger dataset (e.g., the latest full version of MovieClips).
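
If it helps, here is a small sketch (not part of this repo) for peeking at the demo pickle so you can mirror its format when building your own pre-training set. It assumes nothing about the file's contents and simply prints a shallow summary of whatever is stored there.

# inspect_instance_meta.py -- print a shallow summary of the demo pickle
import pickle

def summarize(obj, indent=0, max_depth=2):
    """Print the type/shape of obj, descending a couple of levels at most."""
    pad = " " * indent
    if isinstance(obj, dict) and max_depth > 0:
        print(f"{pad}dict with {len(obj)} keys")
        for key in list(obj)[:5]:          # show only the first few keys
            print(f"{pad}  {key!r} ->")
            summarize(obj[key], indent + 4, max_depth - 1)
    elif isinstance(obj, (list, tuple)) and max_depth > 0:
        print(f"{pad}{type(obj).__name__} of length {len(obj)}")
        if obj:
            summarize(obj[0], indent + 4, max_depth - 1)
    else:
        print(f"{pad}{type(obj).__name__}: {repr(obj)[:60]}")

if __name__ == "__main__":
    with open("data/instance_meta/instance_meta_pretrain_demo.pkl", "rb") as f:
        meta = pickle.load(f)
    summarize(meta)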

Training and evaluating on AVA v2.2

bash run_ava.sh

This should achieve 31.0 mAP.

Training and evaluating on LVU tasks

bash run.sh [1-9]

The argument selects which of the nine LVU tasks to run. Please see run.py for details.
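
For convenience, the following sketch (not part of this repo) runs all nine tasks back to back by calling run.sh with each index in turn, exactly as the command above does manually.

# run_all_lvu_tasks.py -- invoke run.sh once per task index 1..9
import subprocess

if __name__ == "__main__":
    for task_id in range(1, 10):
        print(f"=== LVU task {task_id} ===")
        subprocess.run(["bash", "run.sh", str(task_id)], check=True)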

Acknowledgment

This implementation largely borrows from Huggingface Transformers. Please consider citing it if you use this repo.
