[AAAI2022] Source code for our paper《Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning》

Last update: Oct 26, 2022

Overview

SSVC

Source code for the paper [Suppressing Static Visual Cues via Normalizing Flows for Self-Supervised Video Representation Learning].

Samples of the generated motion-preserved videos with threshold $\alpha=0.5$.

Requirements

python3
torch >= 1.1
PIL
FrEIA==0.2 (flow-based model)
lintel==1.0 (decodes mp4 videos on the fly)
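
Before launching training, it can help to confirm that the packages above are installed. The following is a minimal sketch using only the standard library. The distribution names in `REQUIRED` (in particular `Pillow` for PIL) are assumptions about how the packages are published, and `check_requirements` is a hypothetical helper that is not part of this repository.

```python
from importlib import metadata

# Assumed distribution names; adjust if your environment uses different ones.
REQUIRED = ["torch", "Pillow", "FrEIA", "lintel"]


def check_requirements(names=REQUIRED):
    """Return {name: installed version or None} for each distribution."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in check_requirements().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

The script only reports what is installed; comparing the reported versions against the pins listed above is left to the reader.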

Structure

backbone
data
- lists: train/val lists (.txt)
- augmentation.py: train/val data augmentation used during SSL pre-training
- vDataLoader.py: data loader; customize the paths to your data lists here
model
- advflow: flow-based model
- classifier.py: linear classifier for downstream tasks
- infonce.py: combines S$^2$VC with MoCo
flow
- pre-trained flow-based model weights
utils
main_pretrain.py: entry point for self-supervised pre-training
main_eval.py: entry point for supervised fine-tuning and evaluation
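
For readers unfamiliar with the contrastive objective that MoCo is trained with, the sketch below computes the InfoNCE loss for a single query in pure Python. It is an illustration under stated assumptions, not the contents of `infonce.py`: the function names are hypothetical, the 0.07 temperature is MoCo's published default and is used here only as an example value, and how S$^2$VC is combined with this loss is not shown.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE loss for one query: negative log-softmax of the positive logit."""
    q = l2_normalize(query)
    keys = [l2_normalize(positive)] + [l2_normalize(n) for n in negatives]
    # Cosine similarities scaled by the temperature; index 0 is the positive.
    logits = [dot(q, k) / temperature for k in keys]
    # Subtract the max logit for numerical stability (log-sum-exp trick).
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]
```

The loss is small when the query is closer to its positive key than to any negative, and grows as a negative becomes the closest key.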

Self-supervised Pretrain

DDP

python -m torch.distributed.launch --nproc_per_node=1 --master_port 1234 main_pretrain.py --net r3d18 --img_dim 112 --seq_len 16 --aug_type 1 -t 0.5 -bsz 64 --gpu 0,1 --dataset XX

Single GPU

python main_pretrain.py --net r3d18 --img_dim 112 --seq_len 16 --aug_type 1 -t 0.5 -bsz 64 --gpu 0 --dataset XX

Evaluation

NN-Retrieval

python main_eval.py --retrieval --test SSL_Pt_Model_PTH --dataset XX --gpu X
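
Nearest-neighbour retrieval is commonly scored as recall at k: each test clip's feature is matched against the training features, and a hit is counted when any of the k closest training clips shares its class. The sketch below shows that metric in pure Python. It assumes cosine similarity and precomputed feature vectors; `recall_at_k` is a hypothetical name, and whether `main_eval.py` follows exactly this protocol is not confirmed by this README.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den


def recall_at_k(test_feats, test_labels, train_feats, train_labels, k=1):
    """Fraction of test clips whose k nearest training clips contain the same class."""
    hits = 0
    for feat, label in zip(test_feats, test_labels):
        # Rank training clips by cosine similarity, most similar first.
        ranked = sorted(
            range(len(train_feats)),
            key=lambda i: cosine(feat, train_feats[i]),
            reverse=True,
        )
        if any(train_labels[i] == label for i in ranked[:k]):
            hits += 1
    return hits / len(test_feats)
```

A brute-force sort is used for clarity; for large datasets a batched matrix product over normalized features would be the usual choice.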

Finetune

# fine-tune the whole model
python main_eval.py --train_what ft --pretrain SSL_Pt_Model_PTH --dataset XX --gpu XX \
--net r3d18 --img_dim 224 --seq_len 32

# freeze the backbone and fine-tune only the last layer
python main_eval.py --train_what last --pretrain SSL_Pt_Model_PTH --dataset XX --gpu XX \
--net r3d18 --img_dim 224 --seq_len 32

Test

python main_eval.py --train_what XX --ten_crop --test Sup_Ft_Model_PTH --gpu X \
--dataset XX --net r3d18 --img_dim 224 --seq_len 32
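
The `--ten_crop` flag suggests ten-crop testing, which conventionally averages predictions over the four corner crops and the center crop of each frame, plus the horizontal flip of each. The sketch below only computes those ten crop specifications. `ten_crop_boxes` is a hypothetical helper, the 256x256 frame size in the usage line is an assumption, and the repository's actual cropping code may differ.

```python
def ten_crop_boxes(width, height, crop):
    """Return ten (left, top, right, bottom, flipped) crop specifications."""
    if crop > width or crop > height:
        raise ValueError("crop size must not exceed the frame size")
    cx, cy = (width - crop) // 2, (height - crop) // 2
    origins = [
        (0, 0),                          # top-left
        (width - crop, 0),               # top-right
        (0, height - crop),              # bottom-left
        (width - crop, height - crop),   # bottom-right
        (cx, cy),                        # center
    ]
    boxes = [(x, y, x + crop, y + crop) for x, y in origins]
    # Each box is used twice: once as-is and once horizontally flipped.
    return [(*b, False) for b in boxes] + [(*b, True) for b in boxes]


# Example: crops of size 224 (matching --img_dim 224) from an assumed 256x256 frame.
specs = ten_crop_boxes(256, 256, 224)
```

Applying the same ten specifications to every frame of a clip keeps the crops temporally consistent.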
