The official codes for the ICCV2021 Oral presentation "Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework"

Overview

P2PNet (ICCV2021 Oral Presentation)

This repository contains codes for the official implementation in PyTorch of P2PNet as described in Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework.

An brief introduction of P2PNet can be found at 机器之心 (almosthuman).

The codes is tested with PyTorch 1.5.0. It may not run with other versions.

Visualized demos for P2PNet

The network

The overall architecture of the P2PNet. Built upon the VGG16, it firstly introduce an upsampling path to obtain fine-grained feature map. Then it exploits two branches to simultaneously predict a set of point proposals and their confidence scores.

Comparison with state-of-the-art methods

The P2PNet achieved state-of-the-art performance on several challenging datasets with various densities.

Methods Venue SHTechPartA
MAE/MSE
SHTechPartB
MAE/MSE
UCF_CC_50
MAE/MSE
UCF_QNRF
MAE/MSE
CAN CVPR'19 62.3/100.0 7.8/12.2 212.2/243.7 107.0/183.0
Bayesian+ ICCV'19 62.8/101.8 7.7/12.7 229.3/308.2 88.7/154.8
S-DCNet ICCV'19 58.3/95.0 6.7/10.7 204.2/301.3 104.4/176.1
SANet+SPANet ICCV'19 59.4/92.5 6.5/9.9 232.6/311.7 -/-
DUBNet AAAI'20 64.6/106.8 7.7/12.5 243.8/329.3 105.6/180.5
SDANet AAAI'20 63.6/101.8 7.8/10.2 227.6/316.4 -/-
ADSCNet CVPR'20 55.4/97.7 6.4/11.3 198.4/267.3 71.3/132.5
ASNet CVPR'20 57.78/90.13 -/- 174.84/251.63 91.59/159.71
AMRNet ECCV'20 61.59/98.36 7.02/11.00 184.0/265.8 86.6/152.2
AMSNet ECCV'20 56.7/93.4 6.7/10.2 208.4/297.3 101.8/163.2
DM-Count NeurIPS'20 59.7/95.7 7.4/11.8 211.0/291.5 85.6/148.3
Ours - 52.74/85.06 6.25/9.9 172.72/256.18 85.32/154.5

Comparison on the NWPU-Crowd dataset.

Methods MAE[O] MSE[O] MAE[L] MAE[S]
MCNN 232.5 714.6 220.9 1171.9
SANet 190.6 491.4 153.8 716.3
CSRNet 121.3 387.8 112.0 522.7
PCC-Net 112.3 457.0 111.0 777.6
CANNet 110.0 495.3 102.3 718.3
Bayesian+ 105.4 454.2 115.8 750.5
S-DCNet 90.2 370.5 82.9 567.8
DM-Count 88.4 388.6 88.0 498.0
Ours 77.44 362 83.28 553.92

The overall performance for both counting and localization.

nAP$_{\delta}$ SHTechPartA SHTechPartB UCF_CC_50 UCF_QNRF NWPU_Crowd
$\delta=0.05$ 10.9% 23.8% 5.0% 5.9% 12.9%
$\delta=0.25$ 70.3% 84.2% 54.5% 55.4% 71.3%
$\delta=0.50$ 90.1% 94.1% 88.1% 83.2% 89.1%
$\delta={{0.05:0.05:0.50}}$ 64.4% 76.3% 54.3% 53.1% 65.0%

Comparison for the localization performance in terms of F1-Measure on NWPU.

Method F1-Measure Precision Recall
FasterRCNN 0.068 0.958 0.035
TinyFaces 0.567 0.529 0.611
RAZ 0.599 0.666 0.543
Crowd-SDNet 0.637 0.651 0.624
PDRNet 0.653 0.675 0.633
TopoCount 0.692 0.683 0.701
D2CNet 0.700 0.741 0.662
Ours 0.712 0.729 0.695

Installation

  • Clone this repo into a directory named P2PNET_ROOT
  • Organize your datasets as required
  • Install Python dependencies. We use python 3.6.5 and pytorch 1.5.0
pip install -r requirements.txt

Organize the counting dataset

We use a list file to collect all the images and their ground truth annotations in a counting dataset. When your dataset is organized as recommended in the following, the format of this list file is defined as:

train/scene01/img01.jpg train/scene01/img01.txt
train/scene01/img02.jpg train/scene01/img02.txt
...
train/scene02/img01.jpg train/scene02/img01.txt

Dataset structures:

DATA_ROOT/
        |->train/
        |    |->scene01/
        |    |->scene02/
        |    |->...
        |->test/
        |    |->scene01/
        |    |->scene02/
        |    |->...
        |->train.list
        |->test.list

DATA_ROOT is your path containing the counting datasets.

Annotations format

For the annotations of each image, we use a single txt file which contains one annotation per line. Note that indexing for pixel values starts at 0. The expected format of each line is:

x1 y1
x2 y2
...

Training

The network can be trained using the train.py script. For training on SHTechPartA, use

CUDA_VISIBLE_DEVICES=0 python train.py --data_root $DATA_ROOT \
    --dataset_file SHHA \
    --epochs 3500 \
    --lr_drop 3500 \
    --output_dir ./logs \
    --checkpoints_dir ./weights \
    --tensorboard_dir ./logs \
    --lr 0.0001 \
    --lr_backbone 0.00001 \
    --batch_size 8 \
    --eval_freq 1 \
    --gpu_id 0

By default, a periodic evaluation will be conducted on the validation set.

Testing

A trained model (with an MAE of 51.96) on SHTechPartA is available at "./weights", run the following commands to launch a visualization demo:

CUDA_VISIBLE_DEVICES=0 python run_test.py --weight_path ./weights/SHTechA.pth --output_dir ./logs/

Acknowledgements

  • Part of codes are borrowed from the C^3 Framework.
  • We refer to DETR to implement our matching strategy.

Citing P2PNet

If you find P2PNet is useful in your project, please consider citing us:

@inproceedings{song2021rethinking,
  title={Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework},
  author={Song, Qingyu and Wang, Changan and Jiang, Zhengkai and Wang, Yabiao and Tai, Ying and Wang, Chengjie and Li, Jilin and Huang, Feiyue and Wu, Yang},
  journal={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}

Related works from Tencent Youtu Lab

  • [AAAI2021] To Choose or to Fuse? Scale Selection for Crowd Counting. (paper link & codes)
  • [ICCV2021] Uniformity in Heterogeneity: Diving Deep into Count Interval Partition for Crowd Counting. (paper link & codes)
Owner
Tencent YouTu Research
Tencent YouTu Research
Swapping face using Face Mesh with TensorFlow Lite

Swapping face using Face Mesh with TensorFlow Lite

iwatake 17 Apr 26, 2022
Video Swin Transformer - PyTorch

Video-Swin-Transformer-Pytorch This repo is a simple usage of the official implementation "Video Swin Transformer". Introduction Video Swin Transforme

Haofan Wang 116 Dec 20, 2022
Este conversor criará a medida exata para sua receita de capuccino gelado da grandiosa Rafaella Ballerini!

ConversorDeMedidas_CapuccinoGelado Este conversor criará a medida exata para sua receita de capuccino gelado da grandiosa Rafaella Ballerini! Requirem

Arthur Ottoni Ribeiro 48 Nov 15, 2022
simple demo codes for Learning to Teach with Dynamic Loss Functions

Learning to Teach with Dynamic Loss Functions This repo contains the simple demo for the NeurIPS-18 paper: Learning to Teach with Dynamic Loss Functio

Lijun Wu 15 Dec 30, 2021
This is an official pytorch implementation of Lite-HRNet: A Lightweight High-Resolution Network.

Lite-HRNet: A Lightweight High-Resolution Network Introduction This is an official pytorch implementation of Lite-HRNet: A Lightweight High-Resolution

HRNet 675 Dec 25, 2022
Unofficial Tensorflow-Keras implementation of Fastformer based on paper [Fastformer: Additive Attention Can Be All You Need](https://arxiv.org/abs/2108.09084).

Fastformer-Keras Unofficial Tensorflow-Keras implementation of Fastformer based on paper Fastformer: Additive Attention Can Be All You Need. Tensorflo

Yam Peleg 10 Jan 30, 2022
A big endian Gentoo port developed on a Pine64.org RockPro64

Gentoo-aarch64_be A big endian Gentoo port developed on a Pine64.org RockPro64 The endian wars are over... little endian won. As a result, it is incre

Rory Bolt 6 Dec 07, 2022
constructing maps of intellectual influence from publication data

Influencemap Project @ ANU Influence in the academic communities has been an area of interest for researchers. This can be seen in the popularity of a

CS Metrics 13 Jun 18, 2022
Learning Saliency Propagation for Semi-supervised Instance Segmentation

Learning Saliency Propagation for Semi-supervised Instance Segmentation PyTorch Implementation This repository contains: the PyTorch implementation of

Berkeley DeepDrive 68 Oct 18, 2022
A Home Assistant custom component for Lobe. Lobe is an AI tool that can classify images.

Lobe This is a Home Assistant custom component for Lobe. Lobe is an AI tool that can classify images. This component lets you easily use an exported m

Kendell R 4 Feb 28, 2022
Stitch it in Time: GAN-Based Facial Editing of Real Videos

STIT - Stitch it in Time [Project Page] Stitch it in Time: GAN-Based Facial Edit

1.1k Jan 04, 2023
A JAX-based research framework for writing differentiable numerical simulators with arbitrary discretizations

jaxdf - JAX-based Discretization Framework Overview | Example | Installation | Documentation ⚠️ This library is still in development. Breaking changes

UCL Biomedical Ultrasound Group 65 Dec 23, 2022
SAMO: Streaming Architecture Mapping Optimisation

SAMO: Streaming Architecture Mapping Optimiser The SAMO framework provides a method of optimising the mapping of a Convolutional Neural Network model

Alexander Montgomerie-Corcoran 20 Dec 10, 2022
Repository for GNSS-based position estimation using a Deep Neural Network

Code repository accompanying our work on 'Improving GNSS Positioning using Neural Network-based Corrections'. In this paper, we present a Deep Neural

32 Dec 13, 2022
A PyTorch implementation of Implicit Q-Learning

IQL-PyTorch This repository houses a minimal PyTorch implementation of Implicit Q-Learning (IQL), an offline reinforcement learning algorithm, along w

Garrett Thomas 30 Dec 12, 2022
Hierarchical Aggregation for 3D Instance Segmentation (ICCV 2021)

HAIS Hierarchical Aggregation for 3D Instance Segmentation (ICCV 2021) by Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang*. (*) Corresp

Hust Visual Learning Team 145 Jan 05, 2023
Small-bets - Ergodic Experiment With Python

Ergodic Experiment Based on this video. Run this experiment with this command: p

Michael Brant 3 Jan 11, 2022
BabelCalib: A Universal Approach to Calibrating Central Cameras. In ICCV (2021)

BabelCalib: A Universal Approach to Calibrating Central Cameras This repository contains the MATLAB implementation of the BabelCalib calibration frame

Yaroslava Lochman 55 Dec 30, 2022
Autonomous Driving on Curvy Roads without Reliance on Frenet Frame: A Cartesian-based Trajectory Planning Method

C++/ROS Source Codes for "Autonomous Driving on Curvy Roads without Reliance on Frenet Frame: A Cartesian-based Trajectory Planning Method" published in IEEE Trans. Intelligent Transportation Systems

Bai Li 88 Dec 23, 2022
This is a Machine Learning Based Hand Detector Project, It Uses Machine Learning Models and Modules Like Mediapipe, Developed By Google!

Machine Learning Hand Detector This is a Machine Learning Based Hand Detector Project, It Uses Machine Learning Models and Modules Like Mediapipe, Dev

Popstar Idhant 3 Feb 25, 2022