Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd.

Last update: Dec 06, 2022

Head Detector

The head_detection module can be installed with pip so that it works plug-and-play with HeadHunter-T.

Requirements

NVIDIA driver >= 418
CUDA 10.0 and a compatible cuDNN
Python packages: to install the required Python packages, run conda env create -f head_detection.yml.
Activate the resulting Anaconda environment, head_detection, with source activate head_detection or conda activate head_detection.
Alternatively, install the required packages with pip install -r requirements.txt, or update an existing environment with the yml file above.

Training

To train a model, define the environment variable NGPU (used below for --nproc_per_node and --world_size), prepare a config file, and run the following command:

$ python -m torch.distributed.launch --nproc_per_node=$NGPU --use_env train.py --cfg_file config/config_chuman.yaml --world_size $NGPU --num_workers 4

Training is currently supported on (a) the ScutHead dataset, (b) CrowdHuman + ScutHead combined, and (c) our proposed CroHD dataset. The dataset is selected in the config file.
A config file must be defined before training. The Config file section below describes it in more detail.
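When scripting several runs, the launch command above can also be assembled programmatically. The following is a minimal sketch and not part of the repository: build_launch_command is a hypothetical helper that only reproduces the flags shown in the command above and reads NGPU from the environment when no value is passed.

```python
import os
import shlex


def build_launch_command(cfg_file, num_workers=4, ngpu=None):
    """Return the argv list for the distributed training command shown above.

    Hypothetical helper for illustration; not part of the repository.
    """
    if ngpu is None:
        # NGPU is the environment variable the training instructions ask for.
        ngpu = int(os.environ["NGPU"])
    if ngpu < 1:
        raise ValueError("NGPU must be at least 1")
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={ngpu}", "--use_env",
        "train.py",
        "--cfg_file", cfg_file,
        "--world_size", str(ngpu),
        "--num_workers", str(num_workers),
    ]


cmd = build_launch_command("config/config_chuman.yaml", ngpu=2)
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root.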

Evaluation and Testing

Unlike training, testing and evaluation do not use a config file. Instead, all parameters are passed as command-line arguments when running the code. Refer to the respective files, evaluate.py and test.py, for the available arguments.
evaluate.py evaluates on the validation/test set using the AP, MMR, F1, MODA and MODP metrics.
test.py runs the detector over a set of images from the testing set for qualitative evaluation.
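For readers unfamiliar with the count-based metrics listed above, here is a minimal sketch of F1 and MODA using their standard definitions (MODA as in the CLEAR evaluation protocol, with unit costs for misses and false positives). It is an illustration only and is not taken from evaluate.py; the function names and the per-frame count inputs are assumptions.

```python
def f1_score(tp, fp, fn):
    """F1 from detection counts: harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def moda(misses, false_positives, num_gt):
    """MODA over a sequence, from per-frame counts.

    MODA = 1 - (total misses + total false positives) / (total ground-truth boxes)
    """
    total_gt = sum(num_gt)
    if total_gt == 0:
        raise ValueError("MODA is undefined without ground-truth boxes")
    return 1.0 - (sum(misses) + sum(false_positives)) / total_gt


# Two frames with 10 ground-truth heads each (made-up counts).
print(f1_score(tp=8, fp=2, fn=2))
print(moda(misses=[1, 1], false_positives=[0, 2], num_gt=[10, 10]))
```

Because MODA penalizes false positives and misses against the number of ground-truth boxes, it can become negative when a detector produces many spurious boxes.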

Config file

A config file is required for all training. It is meant to reduce the number of arguments passed on each execution. Each sub-section is described below.
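As a rough illustration of this layout, the sketch below checks that a loaded configuration contains the three sub-sections this README describes (DATASET, TRAINING, NETWORK). It assumes the YAML file has already been parsed into a Python dictionary with those names as top-level keys. That nesting, the helper name missing_sections, and the sample values are assumptions, not the repository's code.

```python
REQUIRED_SECTIONS = ("DATASET", "TRAINING", "NETWORK")


def missing_sections(cfg):
    """Return the required top-level sections absent from a parsed config dict."""
    return [name for name in REQUIRED_SECTIONS if name not in cfg]


# Hypothetical parsed config; the values are placeholders.
example_cfg = {
    "DATASET": {"base_path": "/path/to/dataset"},
    "TRAINING": {"start_epoch": 0},
}
print(missing_sections(example_cfg))  # ['NETWORK']
```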

DATASET
1. Set base_path to the parent directory in which the dataset is located.
2. Train and Valid are .txt files that contain the paths of the images, relative to the base_path defined above, and their corresponding ground truth in (x_min, y_min, x_max, y_max) format. The scripts that generate these files for the three datasets are in the data directory. For example,
```
/path/to/image.png
x_min_1, y_min_1, x_max_1, y_max_1
x_min_2, y_min_2, x_max_2, y_max_2
x_min_3, y_min_3, x_max_3, y_max_3
.
.
.
```
3. mean_std are the RGB means and standard deviations of the training dataset. If not provided, they can be computed prior to the start of training.
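The Train and Valid listing format shown in item 2 can be read back with a few lines of Python. This is a minimal sketch, not the repository's loader: parse_annotations is a hypothetical helper, and it assumes that any line with four comma-separated numbers is a box and that any other non-empty line starts a new image entry.

```python
def parse_annotations(lines):
    """Parse the listing into {image_path: [(x_min, y_min, x_max, y_max), ...]}."""
    annotations = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        parts = [p.strip() for p in line.split(",")]
        box = None
        if len(parts) == 4:
            try:
                box = tuple(float(p) for p in parts)
            except ValueError:
                box = None  # four fields but not numeric: treat as a path
        if box is None:
            # Any non-box line is taken to be the path of the next image.
            current = line
            annotations[current] = []
        elif current is None:
            raise ValueError("box line found before any image path")
        else:
            annotations[current].append(box)
    return annotations


sample = [
    "a/img1.png",
    "10, 20, 30, 40",
    "15, 25, 35, 45",
    "a/img2.png",
    "1, 2, 3, 4",
]
print(parse_annotations(sample))
```

To read a real listing, pass an open file object, for example parse_annotations(open(path)), since iterating over a file yields its lines.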
TRAINING
1. Provide pretrained_model and the corresponding start_epoch to resume training.
2. milestones are the epochs at which the learning rate is multiplied by 0.1.
3. The only_backbone option loads just the ResNet backbone and not the head. It is not applicable to mobilenet.
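To make the effect of milestones concrete, the sketch below computes the learning rate in force at a given epoch when it is multiplied by 0.1 at each milestone. It uses the same closed form as PyTorch's MultiStepLR scheduler; whether train.py uses that scheduler is an assumption, and the base learning rate and milestone values are made up for illustration.

```python
from bisect import bisect_right


def lr_at_epoch(base_lr, milestones, epoch, gamma=0.1):
    """Learning rate after applying a decay of gamma at every milestone reached."""
    # bisect_right counts the milestones that are <= epoch.
    return base_lr * gamma ** bisect_right(sorted(milestones), epoch)


# Hypothetical schedule: base learning rate 0.01, decays at epochs 20 and 25.
for epoch in (0, 19, 20, 25):
    print(epoch, lr_at_epoch(0.01, [20, 25], epoch))
```

When resuming with start_epoch, the same function shows which learning rate the resumed run should start from.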
NETWORK
1. The parameters listed here are as described in the experiment section of the paper.
2. When using median_anchors, the anchors have to be defined in anchors.py.
3. We experimented with mobilenet, resnet50 and resnet150 as alternative backbones. This experiment was not reported in the paper due to space constraints. Accuracy decreased significantly with mobilenet, while resnet50 and resnet150 yielded nearly the same performance.
4. We also briefly experimented with Deformable Convolutions but again didn't see noticeable improvements in performance. The code we used is available in this repository.
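This README does not say how the values in anchors.py were obtained. As one plausible illustration of what median-based anchors could look like, the sketch below takes the median box size and aspect ratio over a set of ground-truth boxes. This is an assumption for illustration, not the authors' procedure, and median_box_stats is a hypothetical helper.

```python
from statistics import median


def median_box_stats(boxes):
    """Return (median size, median height/width ratio) for (x_min, y_min, x_max, y_max) boxes."""
    sizes, ratios = [], []
    for x_min, y_min, x_max, y_max in boxes:
        w, h = x_max - x_min, y_max - y_min
        if w <= 0 or h <= 0:
            continue  # skip degenerate boxes
        sizes.append((w * h) ** 0.5)  # side of the square with the same area
        ratios.append(h / w)
    if not sizes:
        raise ValueError("no valid boxes")
    return median(sizes), median(ratios)


# Made-up boxes in the (x_min, y_min, x_max, y_max) format used by the dataset files.
boxes = [(0, 0, 10, 10), (0, 0, 20, 20), (0, 0, 30, 60)]
print(median_box_stats(boxes))
```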

Note :

This codebase borrows a notable portion from pytorch-vision, because some of its modules cannot be imported as a package.

Citation :

@InProceedings{Sundararaman_2021_CVPR,
    author    = {Sundararaman, Ramana and De Almeida Braga, Cedric and Marchand, Eric and Pettre, Julien},
    title     = {Tracking Pedestrian Heads in Dense Crowd},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {3865-3875}
}

Owner

Ramana Subramanyam