A new codebase for Group Activity Recognition. It contains the code for the ICCV 2021 paper "Spatio-Temporal Dynamic Inference Network for Group Activity Recognition" as well as several other methods.

Last update: Dec 12, 2022

Overview

Spatio-Temporal Dynamic Inference Network for Group Activity Recognition

The source code for the ICCV 2021 paper: Spatio-Temporal Dynamic Inference Network for Group Activity Recognition.
[paper] [supplemental material] [arXiv]

If you find our work or the codebase inspiring and useful for your research, please cite:

@inproceedings{yuan2021DIN,
  title={Spatio-Temporal Dynamic Inference Network for Group Activity Recognition},
  author={Yuan, Hangjie and Ni, Dong and Wang, Mang},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={7476--7485},
  year={2021}
}

Dependencies

Software Environment: Linux (CentOS 7)
Hardware Environment: NVIDIA TITAN RTX
Python 3.6
PyTorch 1.2.0, Torchvision 0.4.0
RoIAlign for PyTorch
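
The listed versions can be checked before training. The following is a minimal sketch, not part of the repository: the helper names `version_matches` and `check_environment` are hypothetical, and it assumes that `torch` and `torchvision` expose a `__version__` string that may carry a local build suffix such as `+cu92`.

```python
import importlib

# Versions listed in the Dependencies section.
REQUIRED = {"torch": "1.2.0", "torchvision": "0.4.0"}


def version_matches(installed, required):
    """Compare release numbers, ignoring a local build suffix such as '+cu92'."""
    release = installed.split("+", 1)[0]
    return release == required


def check_environment(required=REQUIRED):
    """Return {package: (installed_version_or_None, ok)} without raising."""
    report = {}
    for name, wanted in required.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is not installed at all.
            report[name] = (None, False)
            continue
        installed = getattr(module, "__version__", "")
        report[name] = (installed, version_matches(installed, wanted))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_environment().items():
        status = "OK" if ok else "MISMATCH"
        print("{}: installed={} required={} [{}]".format(
            name, installed, REQUIRED[name], status))
```

A mismatch does not necessarily mean the code will fail; the versions above are simply the ones the authors list.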

Prepare Datasets

Download the publicly available datasets from the following links: Volleyball dataset and Collective Activity dataset.
Unzip each dataset file into data/volleyball or data/collective, respectively.
Download the file tracks_normalized.pkl from cvlab-epfl/social-scene-understanding and put it into data/volleyball/videos.
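
The resulting layout can be verified with a short script. This is a sketch under the assumption that the three paths named above are the only ones worth checking; the helper `missing_dataset_paths` is hypothetical and does not inspect the contents of the dataset folders.

```python
from pathlib import Path

# Paths named in the preparation steps, relative to PROJECT_PATH.
EXPECTED_PATHS = (
    "data/volleyball",
    "data/volleyball/videos/tracks_normalized.pkl",
    "data/collective",
)


def missing_dataset_paths(project_root, expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_dataset_paths(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

If only one of the two datasets is used, the other dataset's entry can be removed from `EXPECTED_PATHS`.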

Using Docker

Check out the repository and cd into PROJECT_PATH
Build the Docker image

docker build -t din_gar https://github.com/JacobYuan7/DIN_GAR.git#main

Run the Docker container

docker run --shm-size=2G -v "$(pwd)/data/volleyball":/opt/DIN_GAR/data/volleyball -v "$(pwd)/result":/opt/DIN_GAR/result --rm -it din_gar

--shm-size=2G: Extends the container's shared memory size to prevent "ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm)." Alternatively, use --ipc=host.
-v "$(pwd)/data/volleyball":/opt/DIN_GAR/data/volleyball: Makes the host's folder data/volleyball available inside the container at /opt/DIN_GAR/data/volleyball. The host path must be absolute (hence $(pwd), which assumes the command is run from PROJECT_PATH); Docker treats a non-absolute source as a named volume.
-v "$(pwd)/result":/opt/DIN_GAR/result: Makes the host's folder result available inside the container at /opt/DIN_GAR/result.
-it & --rm: Starts the container with an interactive session (PROJECT_PATH is /opt/DIN_GAR) and removes the container after the session is closed.
din_gar: The name/tag of the image.
Optional: --gpus='"device=7"' restricts the GPU devices the container can access.
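
The flags described above can also be assembled programmatically. The sketch below only builds the argument list and does not run it; `build_docker_run` is a hypothetical helper. It resolves the host folders to absolute paths, since Docker interprets a non-absolute `-v` source as a volume name rather than a host folder.

```python
from pathlib import Path


def build_docker_run(project_root, image="din_gar", shm_size="2G", gpu_device=None):
    """Build the `docker run` argument list for the DIN_GAR container."""
    root = Path(project_root).resolve()
    cmd = ["docker", "run", "--shm-size={}".format(shm_size)]
    # Bind-mount the dataset and result folders at the same relative
    # locations under /opt/DIN_GAR inside the container.
    for sub in ("data/volleyball", "result"):
        cmd += ["-v", "{}:/opt/DIN_GAR/{}".format(root / sub, sub)]
    if gpu_device is not None:
        # Equivalent to --gpus='"device=7"' on a shell command line.
        cmd.append('--gpus="device={}"'.format(gpu_device))
    cmd += ["--rm", "-it", image]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_docker_run(".", gpu_device=7)))
```

The list can be passed to `subprocess.run` from a terminal session; `-it` requires an interactive terminal.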

Get Started

Train the Base Model: Fine-tune the base model for the dataset.

# Volleyball dataset
cd PROJECT_PATH 
python scripts/train_volleyball_stage1.py

# Collective Activity dataset
cd PROJECT_PATH 
python scripts/train_collective_stage1.py
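
The two stage-1 commands differ only in the script name, so a small launcher can select the script by dataset. This is a sketch, not part of the repository: `stage1_command` and `run_stage1` are hypothetical helpers, and the script paths are the ones shown above.

```python
import subprocess
import sys

# Stage-1 scripts from the commands above, relative to PROJECT_PATH.
STAGE1_SCRIPTS = {
    "volleyball": "scripts/train_volleyball_stage1.py",
    "collective": "scripts/train_collective_stage1.py",
}


def stage1_command(dataset, python=sys.executable):
    """Return the argument list that trains the base model for `dataset`."""
    if dataset not in STAGE1_SCRIPTS:
        raise ValueError("unknown dataset {!r}; expected one of {}".format(
            dataset, sorted(STAGE1_SCRIPTS)))
    return [python, STAGE1_SCRIPTS[dataset]]


def run_stage1(dataset, project_path):
    """Run stage-1 training from PROJECT_PATH; raises if the script fails."""
    return subprocess.run(stage1_command(dataset), cwd=str(project_path), check=True)
```

For example, `run_stage1("volleyball", "/opt/DIN_GAR")` would correspond to the first command above when run inside the Docker container.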

Train with the reasoning module: Append the reasoning modules onto the base model to get a reasoning model.
1. Volleyball dataset
  - DIN
```
python scripts/train_volleyball_stage2_dynamic.py
```
  - lite DIN
    To run the lite version of DIN, set cfg.lite_dim = 128 in scripts/train_volleyball_stage2_dynamic.py.
```
python scripts/train_volleyball_stage2_dynamic.py
```
  - ST-factorized DIN
    To run ST-factorized DIN, set cfg.ST_kernel_size = [(1,3),(3,1)] and cfg.hierarchical_inference = True in scripts/train_volleyball_stage2_dynamic.py.

    Note that if you set cfg.hierarchical_inference = False, cfg.ST_kernel_size = [(1,3),(3,1)] and cfg.num_DIN = 2, then multiple interaction fields run in parallel.
```
python scripts/train_volleyball_stage2_dynamic.py
```
  Other models re-implemented by us according to their papers or publicly available code:
  - AT
```
python scripts/train_volleyball_stage2_at.py
```
  - PCTDM
```
python scripts/train_volleyball_stage2_pctdm.py
```
  - SACRF
```
python scripts/train_volleyball_stage2_sacrf_biute.py
```
  - ARG
```
python scripts/train_volleyball_stage2_arg.py
```
  - HiGCIN
```
python scripts/train_volleyball_stage2_higcin.py
```
2. Collective Activity dataset
  - DIN
```
python scripts/train_collective_stage2_dynamic.py
```
  - lite DIN
    To run the lite version of DIN, set cfg.lite_dim = 128 in scripts/train_collective_stage2_dynamic.py.
```
python scripts/train_collective_stage2_dynamic.py
```
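
The DIN variants above differ only in a few `cfg` attributes, which can be summarized as follows. This is an illustrative sketch: the attribute names and values are taken from the instructions above, while `DIN_VARIANTS`, `apply_variant`, and the `SimpleNamespace` stand-in for the repository's config object are assumptions made for the example.

```python
from types import SimpleNamespace

# Settings per variant, as described in the instructions above.
DIN_VARIANTS = {
    "lite": {"lite_dim": 128},
    "st_factorized": {
        "ST_kernel_size": [(1, 3), (3, 1)],
        "hierarchical_inference": True,
    },
    # Multiple interaction fields run in parallel.
    "parallel_fields": {
        "ST_kernel_size": [(1, 3), (3, 1)],
        "hierarchical_inference": False,
        "num_DIN": 2,
    },
}


def apply_variant(cfg, variant):
    """Set the attributes for `variant` on a config-like object."""
    for key, value in DIN_VARIANTS[variant].items():
        setattr(cfg, key, value)
    return cfg


cfg = SimpleNamespace()  # stand-in for the config object in the training script
apply_variant(cfg, "st_factorized")
print(cfg.ST_kernel_size, cfg.hierarchical_inference)
```

In the repository itself these values are set by editing the stage-2 training script directly, as described above.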

Our other work, which approaches GAR by incorporating visual context, is also available:

@inproceedings{yuan2021visualcontext,
  title={Learning Visual Context for Group Activity Recognition},
  author={Yuan, Hangjie and Ni, Dong},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={35},
  number={4},
  pages={3261--3269},
  year={2021}
}
