Codes for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing

Overview

Contrast and Mix (CoMix)

The repository contains the codes for the paper Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing part of Advances in Neural Information Processing Systems (NeurIPS) 2021.

Aadarsh Sahoo1, Rutav Shah1, Rameswar Panda2, Kate Saenko2,3, Abir Das1

1 IIT Kharagpur, 2 MIT-IBM Watson AI Lab, 3 Boston University

[Paper] [Project Page]

 

Fig. Temporal Contrastive Learning with Background Mixing and Target Pseudo-labels. Temporal contrastive loss (left) contrasts a single temporally augmented positive (same video, different speed) per anchor against rest of the videos in a mini-batch as negatives. Incorporating background mixing (middle) provides additional positives per anchor possessing same action semantics with a different background alleviating background shift across domains. Incorporating target pseudo-labels (right) additionally enhances the discriminabilty by contrasting the target videos with the same pseudo-label as positives against rest of the videos as negatives.

 

Preparing the Environment

Conda

Please use the comix_environment.yml file to create the conda environment comix as:

conda env create -f comix_environment.yml

Pip

Please use the requirements.txt file to install all the required dependencies as:

pip install -r requirements.txt

Data Directory Structure

All the datasets should be stored in the folder ./data following the convention ./data/ and it must be passed as an argument to base_dir=./data/ .

UCF - HMDB

For ucf_hmdb dataset with base_dir=./data/ucf_hmdb the structure would be as follows:

.
├── ...
├── data
│   ├── ucf_hmdb
│   │   ├── ucf_videos
|   |   |   ├── 
   
    
|   |   |   |   ├── 
    
     
|   |   |   |   ├── 
     
      
|   |   |   |   ├── ...
|   |   |   ├── 
      
       
|   |   |   ├── ...
│   │   ├── hmdb_videos
|   |   ├── ucf_BG
|   |   └── hmdb_BG
│   └──
└──

      
     
    
   
Jester

For Jester dataset with base_dir=./data/jester the structure would be as follows

.
├── ...
├── data
│   ├── jester
|   |   ├── jester_videos
|   |   |   ├── 
   
    
|   |   |   |   ├── 
    
     
|   |   |   |   ├── 
     
      
|   |   |   |   ├── ...
|   |   |   ├── 
      
       
|   |   |   ├── ...
|   |   ├── jester_BG
|   |   |   ├── 
       
         | | | | ├── 
        
          | | | ├── ... └── └── └── 
        
       
      
     
    
   
Epic-Kitchens

For Epic Kitchens dataset with base_dir=./data/epic_kitchens the structure would be as follows (we follow the same structure as in the original dataset) :

.
├── ...
├── data
│   ├── epic_kitchens
|   |   ├── epic_kitchens_videos
|   |   |   ├── train
|   |   |   |   ├── D1
|   |   |   |   |   ├── 
   
    
|   |   |   |   |   |   ├── 
    
     
|   |   |   |   |   |   ├── 
     
      
|   |   |   |   |   |   ├── ...
|   |   |   |   |   ├── 
      
       
|   |   |   |   |   ├── ...
|   |   |   |   ├── D2
|   |   |   |   └── D3
|   |   |   └── test
└── └── └── epic_kitchens_BG

      
     
    
   

For using datasets stored in some other directories, please pass the parameter base_dir accordingly.

Background Extraction using Temporal Median Filtering

Please refer to the folder ./background_extraction for the codes to extract backgrounds using temporal median filtering.

Data

All the required split files are provided inside the directory ./video_splits.

The official download links for the datasets used for this paper are: [UCF-101] [HMDB-51] [Jester] [Epic Kitchens]

Training CoMix

Here are some of the sample and recomended commands to train CoMix for the transfer task of:

UCF -> HMDB from UCF-HMDB dataset:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --manual_seed 1 --dataset_name UCF-HMDB --src_dataset UCF --tgt_dataset HMDB --batch_size 8 --model_root ./checkpoints_ucf_hmdb --save_in_steps 500 --log_in_steps 50 --eval_in_steps 50 --pseudo_threshold 0.7 --warmstart_models True --num_iter_warmstart 4000 --num_iter_adapt 10000 --learning_rate 0.01 --learning_rate_ws 0.01 --lambda_bgm 0.1 --lambda_tpl 0.01 --base_dir ./data/ucf_hmdb

S -> T from Jester dataset:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --manual_seed 1 --dataset_name Jester --src_dataset S --tgt_dataset T --batch_size 8 --model_root ./checkpoints_jester --save_in_steps 500 --log_in_steps 50 --eval_in_steps 50 --pseudo_threshold 0.7 --warmstart_models True --num_iter_warmstart 4000 --num_iter_adapt 10000 --learning_rate 0.01 --learning_rate_ws 0.01 --lambda_bgm 0.1 --lambda_tpl 0.1 --base_dir ./data/jester

D1 -> D2 from Epic-Kitchens dataset:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --manual_seed 1 --dataset_name Epic-Kitchens --src_dataset D1 --tgt_dataset D2 --batch_size 8 --model_root ./checkpoints_epic_d1_d2 --save_in_steps 500 --log_in_steps 50 --eval_in_steps 50 --pseudo_threshold 0.7 --warmstart_models True --num_iter_warmstart 4000 --num_iter_adapt 10000 --learning_rate 0.01 --learning_rate_ws 0.01 --lambda_bgm 0.01 --lambda_tpl 0.01 --base_dir ./data/epic_kitchens

For detailed description regarding the arguments, use:

python main.py --help

Citing CoMix

If you use codes in this repository, consider citing CoMix. Thanks!

@article{sahoo2021contrast,
  title={Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing},
  author={Sahoo, Aadarsh and Shah, Rutav and Panda, Rameswar and Saenko, Kate and Das, Abir},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}
Owner
Computer Vision and Intelligence Research (CVIR)
The Computer Vision and Intelligence Research (CVIR) group is part of the Department of Computer Science and Engineering at IIT Kharagpur.
Computer Vision and Intelligence Research (CVIR)
Picasso: a methods for embedding points in 2D in a way that respects distances while fitting a user-specified shape.

Picasso Code to generate Picasso embeddings of any input matrix. Picasso maps the points of an input matrix to user-defined, n-dimensional shape coord

Pachter Lab 45 Dec 23, 2022
Hierarchical Uniform Manifold Approximation and Projection

HUMAP Hierarchical Manifold Approximation and Projection (HUMAP) is a technique based on UMAP for hierarchical non-linear dimensionality reduction. HU

Wilson Estécio Marcílio Júnior 160 Jan 06, 2023
GitHub repository for the ICLR Computational Geometry & Topology Challenge 2021

ICLR Computational Geometry & Topology Challenge 2022 Welcome to the ICLR 2022 Computational Geometry & Topology challenge 2022 --- by the ICLR 2022 W

42 Dec 13, 2022
ppo_pytorch_cpp - an implementation of the proximal policy optimization algorithm for the C++ API of Pytorch

PPO Pytorch C++ This is an implementation of the proximal policy optimization algorithm for the C++ API of Pytorch. It uses a simple TestEnvironment t

Martin Huber 59 Dec 09, 2022
implementation of paper - You Only Learn One Representation: Unified Network for Multiple Tasks

YOLOR implementation of paper - You Only Learn One Representation: Unified Network for Multiple Tasks To reproduce the results in the paper, please us

Kin-Yiu, Wong 1.8k Jan 04, 2023
A tool to estimate time varying instantaneous reproduction number during epidemics

EpiEstim A tool to estimate time varying instantaneous reproduction number during epidemics. It is described in the following paper: @article{Cori2013

MRC Centre for Global Infectious Disease Analysis 78 Dec 19, 2022
PaSST: Efficient Training of Audio Transformers with Patchout

PaSST: Efficient Training of Audio Transformers with Patchout This is the implementation for Efficient Training of Audio Transformers with Patchout Pa

165 Dec 26, 2022
(under submission) Bayesian Integration of a Generative Prior for Image Restoration

BIGPrior: Towards Decoupling Learned Prior Hallucination and Data Fidelity in Image Restoration Authors: Majed El Helou, and Sabine Süsstrunk {Note: p

Majed El Helou 22 Dec 17, 2022
Official implementation of ETH-XGaze dataset baseline

ETH-XGaze baseline Official implementation of ETH-XGaze dataset baseline. ETH-XGaze dataset ETH-XGaze dataset is a gaze estimation dataset consisting

Xucong Zhang 134 Jan 03, 2023
Code for Towards Streaming Perception (ECCV 2020) :car:

sAP — Code for Towards Streaming Perception ECCV Best Paper Honorable Mention Award Feb 2021: Announcing the Streaming Perception Challenge (CVPR 2021

Martin Li 85 Dec 22, 2022
領域を指定し、キーを入力することで画像を保存するツールです。クラス分類用のデータセット作成を想定しています。

image-capture-class-annotation 領域を指定し、キーを入力することで画像を保存するツールです。 クラス分類用のデータセット作成を想定しています。 Requirement OpenCV 3.4.2 or later Usage 実行方法は以下です。 起動後はマウスクリック4

KazuhitoTakahashi 5 May 28, 2021
TYolov5: A Temporal Yolov5 Detector Based on Quasi-Recurrent Neural Networks for Real-Time Handgun Detection in Video

TYolov5: A Temporal Yolov5 Detector Based on Quasi-Recurrent Neural Networks for Real-Time Handgun Detection in Video Timely handgun detection is a cr

Mario Duran-Vega 18 Dec 26, 2022
Rendering Point Clouds with Compute Shaders

Compute Shader Based Point Cloud Rendering This repository contains the source code to our techreport: Rendering Point Clouds with Compute Shaders and

Markus Schütz 460 Jan 05, 2023
A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition"

A pure PyTorch batched computation implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition"

張致強 14 Dec 02, 2022
PICARD - Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models

This is the official implementation of the following paper: Torsten Scholak, Nathan Schucher, Dzmitry Bahdanau. PICARD - Parsing Incrementally for Con

ElementAI 217 Jan 01, 2023
Element selection for functional materials discovery by integrated machine learning of atomic contributions to properties

Element selection for functional materials discovery by integrated machine learning of atomic contributions to properties 8.11.2021 Andrij Vasylenko I

Leverhulme Research Centre for Functional Materials Design 4 Dec 20, 2022
An offline deep reinforcement learning library

d3rlpy: An offline deep reinforcement learning library d3rlpy is an offline deep reinforcement learning library for practitioners and researchers. imp

Takuma Seno 817 Jan 02, 2023
System Combination for Grammatical Error Correction Based on Integer Programming

System Combination for Grammatical Error Correction Based on Integer Programming This repository contains the code and scripts that implement the syst

NUS NLP Group 0 Mar 29, 2022
The deployment framework aims to provide a simple, lightweight, fast integrated, pipelined deployment framework that ensures reliability, high concurrency and scalability of services.

savior是一个能够进行快速集成算法模块并支持高性能部署的轻量开发框架。能够帮助将团队进行快速想法验证(PoC),避免重复的去github上找模型然后复现模型;能够帮助团队将功能进行流程拆解,很方便的提高分布式执行效率;能够有效减少代码冗余,减少不必要负担。

Tao Luo 125 Dec 22, 2022
Official implementation of Influence-balanced Loss for Imbalanced Visual Classification in PyTorch.

Official implementation of Influence-balanced Loss for Imbalanced Visual Classification in PyTorch.

Seulki Park 70 Jan 03, 2023