Video Contrastive Learning with Global Context

Last update: Dec 26, 2022

Overview

Video Contrastive Learning with Global Context (VCLR)

This is the official PyTorch implementation of our VCLR paper.

Install dependencies

environments

conda create --name vclr python=3.7
conda activate vclr
conda install numpy scipy scikit-learn matplotlib scikit-image
pip install torch==1.7.1 torchvision==0.8.2
pip install opencv-python tqdm termcolor gcc7 ffmpeg tensorflow==1.15.2
pip install mmcv-full==1.2.7

Prepare datasets

Please refer to PREPARE_DATA to prepare the datasets.

Prepare pretrained MoCo weights

In this work, we follow SeCo and initialize the backbone from pretrained MoCo v2 weights.

cd ~
git clone https://github.com/amazon-research/video-contrastive-learning.git
cd video-contrastive-learning
mkdir pretrain && cd pretrain
wget https://dl.fbaipublicfiles.com/moco/moco_checkpoints/moco_v2_200ep/moco_v2_200ep_pretrain.pth.tar
cd ..
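
For readers unfamiliar with the MoCo checkpoint layout, the following is a minimal sketch of how backbone weights are commonly pulled out of such a file. It assumes the public MoCo v2 key naming (`module.encoder_q.*`, with the projection head under `fc`); the function name `extract_moco_backbone` is hypothetical, and the training script in this repository may handle the checkpoint differently.

```python
from collections import OrderedDict


def extract_moco_backbone(state_dict, prefix="module.encoder_q."):
    """Keep only query-encoder weights and drop the MoCo wrapper prefix."""
    backbone = OrderedDict()
    for key, value in state_dict.items():
        # Skip the key encoder, the queue, and the projection head ("fc").
        if not key.startswith(prefix) or key.startswith(prefix + "fc"):
            continue
        backbone[key[len(prefix):]] = value
    return backbone


# Toy stand-in for torch.load(path, map_location="cpu")["state_dict"];
# the string values are placeholders for tensors.
toy_checkpoint = {
    "module.encoder_q.conv1.weight": "w0",
    "module.encoder_q.layer1.0.conv1.weight": "w1",
    "module.encoder_q.fc.0.weight": "head",
    "module.encoder_k.conv1.weight": "key-encoder",
    "module.queue": "queue",
}
backbone_weights = extract_moco_backbone(toy_checkpoint)
print(list(backbone_weights))
```

With a real checkpoint, the resulting dictionary would typically be passed to a ResNet50 via `load_state_dict(..., strict=False)`, since the classification head is intentionally left out.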

Self-supervised pretraining

bash shell/main_train.sh

Checkpoints will be saved to ./results

Downstream tasks

Linear evaluation

To evaluate the quality of the self-supervised representation, we run a linear evaluation (probing) on the Kinetics400 dataset. We first extract features with the frozen pretrained backbone and then train an SVM classifier on them to measure how well the learned features separate the classes.

bash shell/eval_svm.sh
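
The two-stage protocol described above (frozen features, then a linear classifier) can be illustrated with a small sketch. It uses synthetic Gaussian features in place of real backbone outputs, and the scikit-learn pipeline and hyperparameters are assumptions for illustration, not the settings used by `shell/eval_svm.sh`.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
num_classes, dim, per_class = 4, 32, 50

# Synthetic stand-ins for frozen backbone features: one Gaussian blob per class.
centers = rng.normal(scale=5.0, size=(num_classes, dim))


def sample(n):
    """Draw n feature vectors per class around the class centers."""
    labels = np.repeat(np.arange(num_classes), n)
    feats = centers[labels] + rng.normal(size=(labels.size, dim))
    return feats, labels


train_x, train_y = sample(per_class)
test_x, test_y = sample(per_class)

# Only the linear classifier is trained; the "features" are never updated.
clf = make_pipeline(StandardScaler(), LinearSVC(C=1.0))
clf.fit(train_x, train_y)
svm_accuracy = clf.score(test_x, test_y)
print(f"top-1 accuracy: {svm_accuracy:.3f}")
```

Because the backbone stays frozen, the resulting accuracy reflects the quality of the pretrained features rather than the capacity of the classifier.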

Results

Arch	Pretrained dataset	Epoch	Pretrained model	Acc. on K400
ResNet50	Kinetics400	400	Download link	64.1

Video retrieval

bash shell/eval_retrieval.sh

Results

Arch	Pretrained dataset	Epoch	Pretrained model	R@1 on UCF101	R@1 on HMDB51
ResNet50	Kinetics400	400	Download link	70.6	35.2
ResNet50	UCF101	400	Download link	46.8	17.6
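
As a rough illustration of how a top-k retrieval score like the one in this table can be computed, here is a minimal sketch of recall at k with cosine-similarity nearest neighbors. The query/gallery split and the function name `recall_at_k` are assumptions for illustration; the exact protocol is defined by `shell/eval_retrieval.sh`.

```python
import numpy as np


def recall_at_k(query_feats, query_labels, gallery_feats, gallery_labels, k=1):
    """Fraction of queries with a same-class item among the k nearest gallery items."""
    # L2-normalize so that the dot product equals cosine similarity.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = q @ g.T
    # Indices of the k most similar gallery items for each query.
    topk = np.argsort(-sims, axis=1)[:, :k]
    hits = (gallery_labels[topk] == query_labels[:, None]).any(axis=1)
    return float(hits.mean())


# Toy features: in practice these would be clip-level backbone features.
gallery = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
gallery_labels = np.array([0, 1, 2])
queries = np.array([[0.9, 0.1], [0.1, 0.9], [0.6, 0.8]])
query_labels = np.array([0, 1, 0])

r1 = recall_at_k(queries, query_labels, gallery, gallery_labels, k=1)
r2 = recall_at_k(queries, query_labels, gallery, gallery_labels, k=2)
print(r1, r2)
```

In this toy example the third query is closest to a gallery item of the wrong class, so it counts as a miss at k=1 and as a hit at k=2.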

Action recognition & action localization

Here, we use mmaction2 for both tasks. If you are not familiar with mmaction2, you can read the official documentation.

Installation

Step 1: Install mmaction2

To make sure the results can be reproduced, please use our forked version of mmaction2 (version: 0.11.0):
```
conda activate vclr
cd ~
git clone https://github.com/KuangHaofei/mmaction2

cd mmaction2
pip install -v -e .
```
Step 2: Prepare the pretrained weights

Our pretrained backbone uses a different format from the mmaction2 backbones, so it has to be converted to the mmaction2 format first. We provide converted versions of our K400 pretrained weights for TSN and TSM. We also provide the conversion script; you can find it here.

Download the pretrained weights into the checkpoints directory:
```
cd ~/mmaction2
mkdir -p checkpoints
wget -P checkpoints https://haofeik-data.s3.amazonaws.com/VCLR/pretrained/vclr_mm.pth
wget -P checkpoints https://haofeik-data.s3.amazonaws.com/VCLR/pretrained/vclr_mm_tsm.pth
```

Action recognition

Make sure you have prepared the datasets and the environment as described in the previous steps. Assuming you are in the root directory of mmaction2, follow the steps below to fine-tune the TSN or TSM models for action recognition.

For each dataset, the train and test setting can be found in the configuration files.

UCF101

config file: tsn_ucf101.py

train command:

./tools/dist_train.sh configs/recognition/tsn/vclr/tsn_ucf101.py 8 \
  --validate --seed 0 --deterministic

test command:

python tools/test.py configs/recognition/tsn/vclr/tsn_ucf101.py \
  work_dirs/vclr/ucf101/latest.pth \
  --eval top_k_accuracy mean_class_accuracy --out result.json

HMDB51

config file: tsn_hmdb51.py

train command:

./tools/dist_train.sh configs/recognition/tsn/vclr/tsn_hmdb51.py 8 \
  --validate --seed 0 --deterministic

test command:

python tools/test.py configs/recognition/tsn/vclr/tsn_hmdb51.py \
  work_dirs/vclr/hmdb51/latest.pth \
  --eval top_k_accuracy mean_class_accuracy --out result.json

SomethingSomethingV2: TSN

config file: tsn_sthv2.py

train command:

./tools/dist_train.sh configs/recognition/tsn/vclr/tsn_sthv2.py 8 \
  --validate --seed 0 --deterministic

test command:

python tools/test.py configs/recognition/tsn/vclr/tsn_sthv2.py \
  work_dirs/vclr/tsn_sthv2/latest.pth \
  --eval top_k_accuracy mean_class_accuracy --out result.json

SomethingSomethingV2: TSM

config file: tsm_sthv2.py

train command:

./tools/dist_train.sh configs/recognition/tsm/vclr/tsm_sthv2.py 8 \
  --validate --seed 0 --deterministic

test command:

python tools/test.py configs/recognition/tsm/vclr/tsm_sthv2.py \
  work_dirs/vclr/tsm_sthv2/latest.pth \
  --eval top_k_accuracy mean_class_accuracy --out result.json

ActivityNet

config file: tsn_activitynet.py

train command:

./tools/dist_train.sh configs/recognition/tsn/vclr/tsn_activitynet.py 8 \
  --validate --seed 0 --deterministic

test command:

python tools/test.py configs/recognition/tsn/vclr/tsn_activitynet.py \
  work_dirs/vclr/tsn_activitynet/latest.pth \
  --eval top_k_accuracy mean_class_accuracy --out result.json

Results

Arch	Dataset	Finetuned model	Acc.
TSN	UCF101	Download link	85.6
TSN	HMDB51	Download link	54.1
TSN	SomethingSomethingV2	Download link	33.3
TSM	SomethingSomethingV2	Download link	52.0
TSN	ActivityNet	Download link	71.9
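
The test commands above pass `top_k_accuracy` and `mean_class_accuracy` to `--eval`. As a rough guide to what these two metrics measure, here is a minimal NumPy sketch; it is written for illustration under the usual definitions (top-k hit rate, and per-class recall averaged over classes) and is not the mmaction2 implementation.

```python
import numpy as np


def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores."""
    topk = np.argsort(-scores, axis=1)[:, :k]
    return float((topk == labels[:, None]).any(axis=1).mean())


def mean_class_accuracy(scores, labels):
    """Top-1 accuracy computed per class, then averaged over classes."""
    preds = scores.argmax(axis=1)
    per_class = [(preds[labels == c] == c).mean() for c in np.unique(labels)]
    return float(np.mean(per_class))


# Toy class scores for four clips and three classes.
scores = np.array([
    [0.7, 0.2, 0.1],  # label 0, correct
    [0.6, 0.3, 0.1],  # label 0, correct
    [0.5, 0.4, 0.1],  # label 1, wrong at top-1, correct at top-2
    [0.1, 0.2, 0.7],  # label 2, correct
])
labels = np.array([0, 0, 1, 2])

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
mca = mean_class_accuracy(scores, labels)
print(top1, top2, mca)
```

Mean class accuracy weights every class equally, so on imbalanced datasets it can differ noticeably from top-1 accuracy, as it does in this toy example.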

Action localization

Step 1: Fine-tune TSN on ActivityNet as described in the previous section. We assume the finetuned model is saved at work_dirs/vclr/tsn_activitynet/latest.pth

Step 2: Extract ActivityNet features

cd ~/mmaction2/tools/data/activitynet/

python tsn_feature_extraction.py --data-prefix /home/ubuntu/data/ActivityNet/rawframes \
  --data-list /home/ubuntu/data/ActivityNet/anet_train_video.txt \
  --output-prefix /home/ubuntu/data/ActivityNet/rgb_feat \
  --modality RGB --ckpt /home/ubuntu/mmaction2/work_dirs/vclr/tsn_activitynet/latest.pth

python tsn_feature_extraction.py --data-prefix /home/ubuntu/data/ActivityNet/rawframes \
  --data-list /home/ubuntu/data/ActivityNet/anet_val_video.txt \
  --output-prefix /home/ubuntu/data/ActivityNet/rgb_feat \
  --modality RGB --ckpt /home/ubuntu/mmaction2/work_dirs/vclr/tsn_activitynet/latest.pth

python activitynet_feature_postprocessing.py \
  --rgb /home/ubuntu/data/ActivityNet/rgb_feat \
  --dest /home/ubuntu/data/ActivityNet/mmaction_feat

Note that the root directory of ActivityNet is /home/ubuntu/data/ActivityNet/ in our case. Please replace it with the path on your machine.

Step 3: Train and test the BMN model

train

cd ~/mmaction2
./tools/dist_train.sh configs/localization/bmn/bmn_acitivitynet_feature_vclr.py 2 \
  --work-dir work_dirs/vclr/bmn_activitynet --validate --seed 0 --deterministic --bmn

test

python tools/test.py configs/localization/bmn/bmn_acitivitynet_feature_vclr.py \
  work_dirs/vclr/bmn_activitynet/latest.pth \
  --bmn --eval AR@AN --out result.json

Results

Arch	Dataset	Finetuned model	AUC	AR@100
BMN	ActivityNet	Download link	65.5	73.8
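
Proposal metrics of this kind are built on the temporal IoU between predicted and ground-truth segments. The sketch below shows that building block and a single-threshold recall; the helper names are hypothetical, and the full metric (averaging over several tIoU thresholds and proposal budgets) is left to the mmaction2 evaluation code.

```python
def temporal_iou(proposal, ground_truth):
    """Intersection over union of two (start, end) segments in seconds."""
    p_start, p_end = proposal
    g_start, g_end = ground_truth
    intersection = max(0.0, min(p_end, g_end) - max(p_start, g_start))
    union = (p_end - p_start) + (g_end - g_start) - intersection
    return intersection / union if union > 0 else 0.0


def recall_at_threshold(proposals, ground_truths, threshold=0.5):
    """Fraction of ground-truth segments matched by at least one proposal."""
    matched = sum(
        any(temporal_iou(p, g) >= threshold for p in proposals)
        for g in ground_truths
    )
    return matched / len(ground_truths)


# Toy example: two proposals and two annotated actions in one video.
proposals = [(0.0, 10.0), (20.0, 30.0)]
ground_truths = [(2.0, 10.0), (22.0, 40.0)]

iou_first = temporal_iou(proposals[0], ground_truths[0])
recall_strict = recall_at_threshold(proposals, ground_truths, threshold=0.5)
recall_loose = recall_at_threshold(proposals, ground_truths, threshold=0.3)
print(iou_first, recall_strict, recall_loose)
```

Averaging such recall values over a range of tIoU thresholds, with a fixed number of proposals per video, gives an average-recall score.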

Feature visualization

We provide our feature visualization code here.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.