Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd.

Overview

Head Detector

Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd. The head_detection module can be installed using pip in order to be able to plug-and-play with HeadHunter-T.

Requirements

  1. Nvidia Driver >= 418

  2. Cuda 10.0 and compaitible CudNN

  3. Python packages : To install the required python packages; conda env create -f head_detection.yml.

  4. Use the anaconda environment head_detection by activating it, source activate head_detection or conda activate head_detection.

  5. Alternatively pip can be used to install required packages using pip install -r requirements.txt or update your existing environment with the aforementioned yml file.

Training

  1. To train a model, define environment variable NGPU, config file and use the following command

$python -m torch.distributed.launch --nproc_per_node=$NGPU --use_env train.py --cfg_file config/config_chuman.yaml --world_size $NGPU --num_workers 4

  1. Training is currently supported over (a) ScutHead dataset (b) CrowdHuman + ScutHead combined, (c) Our proposed CroHD dataset. This can be mentioned in the config file.

  2. To train the model, config files must be defined. More details about the config files are mentioned in the section below

Evaluation and Testing

  1. Unlike the training, testing and evaluation does not have a config file. Rather, all the parameters are set as argument variable while executing the code. Refer to the respective files, evaluate.py and test.py.
  2. evaluate.py evaluates over the validation/test set using AP, MMR, F1, MODA and MODP metrics.
  3. test.py runs the detector over a "bunch of images" in the testing set for qualitative evaluation.

Config file

A config file is necessary for all training. It's built to ease the number of arg variable passed during each execution. Each sub-sections are as elaborated below.

  1. DATASET

    1. Set the base_path as the parent directory where the dataset is situated at.
    2. Train and Valid are .txt files that contains relative path to respective images from the base_path defined above and their corresponding Ground Truth in (x_min, y_min, x_max, y_max) format. Generation files for the three datasets can be seen inside data directory. For example,
    /path/to/image.png
    x_min_1, y_min_1, x_max_1, y_max_1
    x_min_2, y_min_2, x_max_2, y_max_2
    x_min_3, y_min_3, x_max_3, y_max_3
    .
    .
    .
    
    1. mean_std are RGB means and stdev of the training dataset. If not provided, can be computed prior to the start of the training
  2. TRAINING

    1. Provide pretrained_model and corresponding start_epoch for resuming.
    2. milestones are epoch at which the learning rates are set to 0.1 * lr.
    3. only_backbone option loads just the Resnet backbone and not the head. Not applicable for mobilenet.
  3. NETWORK

    1. The mentioned parameters are as described in experiment section of the paper.
    2. When using median_anchors, the anchors have to be defined in anchors.py.
    3. We experimented with mobilenet, resnet50 and resnet150 as alternative backbones. This experiment was not reported in the paper due to space constraints. We found the accuracy to significantly decrease with mobilenet but resnet50 and resnet150 yielded an almost same performance.
    4. We also briefly experimented with Deformable Convolutions but again didn't see noticable improvements in performance. The code we used are available in this repository.

Note :

This codebase borrows a noteable portion from pytorch-vision owing to the fact some of their modules cannot be "imported" as a package.

Citation :

@InProceedings{Sundararaman_2021_CVPR,
    author    = {Sundararaman, Ramana and De Almeida Braga, Cedric and Marchand, Eric and Pettre, Julien},
    title     = {Tracking Pedestrian Heads in Dense Crowd},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {3865-3875}
}
Owner
Ramana Sundararaman
Ramana Sundararaman
img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation

img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation Figure 1: We estimate the 6DoF rigid transformation of a 3D face (rendered in si

Vítor Albiero 519 Dec 29, 2022
Tensorflow implementation of "BEGAN: Boundary Equilibrium Generative Adversarial Networks"

BEGAN in Tensorflow Tensorflow implementation of BEGAN: Boundary Equilibrium Generative Adversarial Networks. Requirements Python 2.7 or 3.x Pillow tq

Taehoon Kim 922 Dec 21, 2022
Implementation of the "PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences" paper.

PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences Introduction Point cloud sequences are irregular and unordered in the spatial dimen

Hehe Fan 63 Dec 09, 2022
Elevation Mapping on GPU.

Elevation Mapping cupy Overview This is a ros package of elevation mapping on GPU. Code are written in python and uses cupy for GPU calculation. * pla

Robotic Systems Lab - Legged Robotics at ETH Zürich 183 Dec 19, 2022
Continuous Security Group Rule Change Detection & Response at scale

Introduction Get notified of Security Group Changes across all AWS Accounts & Regions in an AWS Organization, with the ability to respond/revert those

Raajhesh Kannaa Chidambaram 3 Aug 13, 2022
Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network

Super Resolution Examples We run this script under TensorFlow 2.0 and the TensorLayer2.0+. For TensorLayer 1.4 version, please check release. 🚀 🚀 🚀

TensorLayer Community 2.9k Jan 08, 2023
Official code of paper "PGT: A Progressive Method for Training Models on Long Videos" on CVPR2021

PGT Code for paper PGT: A Progressive Method for Training Models on Long Videos. Install Run pip install -r requirements.txt. Run python setup.py buil

Bo Pang 27 Mar 30, 2022
Eth brownie struct encoding example

eth-brownie struct encoding example Overview This repository contains an example of encoding a struct, so that it can be used in a function call, usin

Ittai Svidler 2 Mar 04, 2022
"Inductive Entity Representations from Text via Link Prediction" @ The Web Conference 2021

Inductive entity representations from text via link prediction This repository contains the code used for the experiments in the paper "Inductive enti

Daniel Daza 45 Jan 09, 2023
DumpSMBShare - A script to dump files and folders remotely from a Windows SMB share

DumpSMBShare A script to dump files and folders remotely from a Windows SMB shar

Podalirius 178 Jan 06, 2023
Causal Influence Detection for Improving Efficiency in Reinforcement Learning

Causal Influence Detection for Improving Efficiency in Reinforcement Learning This repository contains the code release for the paper "Causal Influenc

Autonomous Learning Group 21 Nov 29, 2022
Clustering with variational Bayes and population Monte Carlo

pypmc pypmc is a python package focusing on adaptive importance sampling. It can be used for integration and sampling from a user-defined target densi

45 Feb 06, 2022
Code for the paper "Balancing Training for Multilingual Neural Machine Translation, ACL 2020"

Balancing Training for Multilingual Neural Machine Translation Implementation of the paper Balancing Training for Multilingual Neural Machine Translat

Xinyi Wang 21 May 18, 2022
Visualizing Yolov5's layers using GradCam

YOLO-V5 GRADCAM I constantly desired to know to which part of an object the object-detection models pay more attention. So I searched for it, but I di

Pooya Mohammadi Kazaj 200 Jan 01, 2023
Qcover is an open source effort to help exploring combinatorial optimization problems in Noisy Intermediate-scale Quantum(NISQ) processor.

Qcover is an open source effort to help exploring combinatorial optimization problems in Noisy Intermediate-scale Quantum(NISQ) processor. It is devel

33 Nov 11, 2022
official Pytorch implementation of ICCV 2021 paper FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting.

FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting By Rui Liu, Hanming Deng, Yangyi Huang, Xiaoyu Shi, Lewei Lu, Wenxiu

77 Dec 27, 2022
Hard cater examples from Hopper ICLR paper

CATER-h Honglu Zhou*, Asim Kadav, Farley Lai, Alexandru Niculescu-Mizil, Martin Renqiang Min, Mubbasir Kapadia, Hans Peter Graf (*Contact: honglu.zhou

NECLA ML Group 6 May 11, 2021
Fully Automatic Page Turning on Real Scores

Fully Automatic Page Turning on Real Scores This repository contains the corresponding code for our extended abstract Henkel F., Schwaiger S. and Widm

Florian Henkel 7 Jan 02, 2022
[CVPR 2022] Official PyTorch Implementation for "Reference-based Video Super-Resolution Using Multi-Camera Video Triplets"

Reference-based Video Super-Resolution (RefVSR) Official PyTorch Implementation of the CVPR 2022 Paper Project | arXiv | RealMCVSR Dataset This repo c

Junyong Lee 151 Dec 30, 2022
A Pytorch loader for MVTecAD dataset.

MVTecAD A Pytorch loader for MVTecAD dataset. It strictly follows the code style of common Pytorch datasets, such as torchvision.datasets.CIFAR10. The

Jiyuan 1 Dec 27, 2021