PyTorch implementations of the paper: "DR.VIC: Decomposition and Reasoning for Video Individual Counting, CVPR, 2022"

Last update: Nov 22, 2022

Related tags

Overview

DRNet for Video Indvidual Counting (CVPR 2022)

Introduction

This is the official PyTorch implementation of paper: DR.VIC: Decomposition and Reasoning for Video Individual Counting. Different from the single image counting methods, it counts the total number of the pedestrians in a video sequence with a person in different frames only being calculated once. DRNet decomposes this new task to estimate the initial crowd number in the first frame and integrate differential crowd numbers in a set of following image pairs (namely current frame and preceding frame).

Catalog

Getting started

preparatoin

Clone this repo in the directory (Root/DRNet):

Install dependencies. We use python 3.7 and pytorch >= 1.6.0 : http://pytorch.org.

conda create -n DRNet python=3.7
conda activate DRNet
conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=10.2 -c pytorch
cd ${DRNet}
pip install -r requirements.txt

PreciseRoIPooling for extracting the feature descriptors

Note: the PreciseRoIPooling [1] module is included in the repo, but it's likely to have some problems when running the code:
1. If you are prompted to install ninja, the following commands will help you.
```
wget https://github.com/ninja-build/ninja/releases/download/v1.8.2/ninja-linux.zip
sudo unzip ninja-linux.zip -d /usr/local/bin/
sudo update-alternatives --install /usr/bin/ninja ninja /usr/local/bin/ninja 1 --force 
```
2. If you encounter errors when compiling the PreciseRoIPooling, you can look up the original repo's issues for help.
Datasets
- HT21 dataset: Download CroHD dataset from this link. Unzip HT21.zip and place HT21 into the folder (Root/dataset/).
- SenseCrowd dataset: To be updated when it is released.
- Download the lists of train/val/test sets at link: dataset., and place them to each dataset folder, respectively.

Training

Check some parameters in config.py before training,

Use __C.DATASET = 'HT21' to set the dataset (default: HT21).
Use __C.GPU_ID = '0' to set the GPU.
Use __C.MAX_EPOCH = 20 to set the number of the training epochs (default:20).
Use __C.EXP_PATH = os.path.join('./exp', __C.DATASET) to set the dictionary for saving the code, weights, and resume point.

Check other parameters (TRAIN_BATCH_SIZE, TRAIN_SIZE etc.) in the Root/DRNet/datasets/setting in case your GPU's memory is not support for the default setting.

run python train.py.

Tips: The training process takes ~10 hours on HT21 dataset with one TITAN RTX (24GB Memory).

Testing

To reproduce the performance, download the pre-trained models and then place pretrained_models folder to Root/DRNet/model/

for HT21:
- Run python test_HT21.py.
for SenseCrowd:
- Run python test_SENSE.py. Then the output file (*_SENSE_cnt.py) will be generated.

Performance

The results on HT21 and SenseCrowd.

HT21 dataset

Method	CroHD11~CroHD15	MAE/MSE/MRAE(%)
Paper: VGG+FPN [2,3]	164.6/1075.5/752.8/784.5/382.3	141.1/192.3/27.4
This Repo's Reproduction: VGG+FPN [2,3]	138.4/1017.5/623.9/659.8/348.5	160.7/217.3/25.1

SenseCrowd dataset

Method	MAE/MSE/MRAE(%)	MIAE/MOAE	D0~D4 (for MAE)
Paper: VGG+FPN [2,3]	12.3/24.7/12.7	1.98/2.01	4.1/8.0/23.3/50.0/77.0
This Repo's Reproduction: VGG+FPN [2,3]	11.7/24.6/11.7	1.99/1.88	3.6/6.8/22.4/42.6/85.2

Video Demo

Please visit bilibili or YouTube to watch the video demonstration.

References

Acquisition of Localization Confidence for Accurate Object Detection, ECCV, 2018.
Very Deep Convolutional Networks for Large-scale Image Recognition, arXiv, 2014.
Feature Pyramid Networks for Object Detection, CVPR, 2017.

Citation

If you find this project is useful for your research, please cite:

@article{han2022drvic,
  title={DR.VIC: Decomposition and Reasoning for Video Individual Counting},
  author={Han, Tao, Bai Lei, Gao, Junyu, Qi Wang, and Ouyang  Wanli},
  booktitle={CVPR},
  year={2022}
}

Acknowledgement

The released PyTorch training script borrows some codes from the C^3 Framework and SuperGlue repositories. If you think this repo is helpful for your research, please consider cite them.

PyTorch implementations of the paper: "DR.VIC: Decomposition and Reasoning for Video Individual Counting, CVPR, 2022"

Related tags

Overview

DRNet for Video Indvidual Counting (CVPR 2022)

Introduction

Catalog

Getting started

preparatoin

Training

Testing

Performance

Video Demo

References

Citation

Acknowledgement

Owner

tao han

Code repository for paper `Skeleton Merger: an Unsupervised Aligned Keypoint Detector`.

Off-policy continuous control in PyTorch, with RDPG, RTD3 & RSAC

SARS-Cov-2 Recombinant Finder for fasta sequences

An Extendible (General) Continual Learning Framework based on Pytorch - official codebase of Dark Experience for General Continual Learning

A PyTorch implementation of Mugs proposed by our paper "Mugs: A Multi-Granular Self-Supervised Learning Framework".

Code accompanying the NeurIPS 2021 paper "Generating High-Quality Explanations for Navigation in Partially-Revealed Environments"

Automatically creates genre collections for your Plex media

TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction.

Aws-machine-learning-university-accelerated-tab - Machine Learning University: Accelerated Tabular Data Class

Official code of the paper "ReDet: A Rotation-equivariant Detector for Aerial Object Detection" (CVPR 2021)

A collection of awesome resources image-to-image translation.

Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning

Open source implementation of "A Self-Supervised Descriptor for Image Copy Detection" (SSCD).

A parallel framework for population-based multi-agent reinforcement learning.

Interpolation-based reduced-order models

Event sourced bank - A wide-and-shallow example using the Python event sourcing library

Official PyTorch implementation of "Improving Face Recognition with Large AgeGaps by Learning to Distinguish Children" (BMVC 2021)

Rule Extraction Methods for Interactive eXplainability

The first machine learning framework that encourages learning ML concepts instead of memorizing class functions.

Qcover is an open source effort to help exploring combinatorial optimization problems in Noisy Intermediate-scale Quantum(NISQ) processor.