PyTorch implementations of the paper: "Learning Independent Instance Maps for Crowd Localization"

Overview

IIM - Crowd Localization


This repo is the official implementation of the paper: Learning Independent Instance Maps for Crowd Localization. The code is developed based on the C3F framework.

Progress

  • Testing Code (2020.12.10)
  • Training Code
    • NWPU (2020.12.14)
    • JHU (2021.01.05)
    • UCF-QNRF (2020.12.30)
    • ShanghaiTech Part A/B (2020.12.29)
    • FDST (2020.12.30)
  • Scale information for UCF-QNRF and ShanghaiTech Part A/B (2021.01.07)

Getting Started

Preparation

  • Prerequisites

    • Python 3.7
    • PyTorch 1.6: http://pytorch.org
    • Other libraries are listed in requirements.txt; run pip install -r requirements.txt.
  • Code

  • Datasets

    • Download the NWPU-Crowd dataset from this link.

    • Unzip the *zip files in turn and place images_part* into the same folder (Root/ProcessedData/NWPU/images).

    • Download the processed labels and the val GT file from this link. Place them into Root/ProcessedData/NWPU/masks and Root/ProcessedData/NWPU, respectively.

    • If you want to reproduce the results on the ShanghaiTech Part A/B, UCF-QNRF, and JHU datasets, follow the instructions in DATA.md to set up the datasets.

    • Finally, the folder tree is below:

   -- ProcessedData
   	|-- NWPU
   		|-- images
   		|   |-- 0001.jpg
   		|   |-- 0002.jpg
   		|   |-- ...
   		|   |-- 5109.jpg
   		|-- masks
   		|   |-- 0001.png
   		|   |-- 0002.png
   		|   |-- ...
   		|   |-- 3609.png
   		|-- train.txt
   		|-- val.txt
   		|-- test.txt
   		|-- val_gt_loc.txt
   -- PretrainedModels
     |-- hrnetv2_w48_imagenet_pretrained.pth
   -- IIM
     |-- datasets
     |-- misc
     |-- ...
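
Before training, it can help to confirm that the layout matches the tree above. The following is a minimal sketch, not part of the repo: the helper name missing_entries and the two path lists are hypothetical, and the paths are simply copied from the tree, assuming ProcessedData, PretrainedModels, and IIM share one parent folder (Root).

```python
from pathlib import Path

# Paths copied from the folder tree above, relative to the common root.
# These lists are an illustration, not something the repo itself defines.
EXPECTED_DIRS = [
    "ProcessedData/NWPU/images",
    "ProcessedData/NWPU/masks",
    "IIM",
]
EXPECTED_FILES = [
    "ProcessedData/NWPU/train.txt",
    "ProcessedData/NWPU/val.txt",
    "ProcessedData/NWPU/test.txt",
    "ProcessedData/NWPU/val_gt_loc.txt",
    "PretrainedModels/hrnetv2_w48_imagenet_pretrained.pth",
]


def missing_entries(root):
    """Return the expected paths (relative to root) that do not exist."""
    root = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


# Example: report anything missing under the current directory.
for entry in missing_entries("."):
    print("missing:", entry)
```

An empty result means every directory and file named in the tree is present; it does not check the number or content of the images and masks.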

Training

  • Run python train.py.
  • Run tensorboard --logdir=exp --port=6006.
  • The validation records are shown as follows: [image: val_curve]
  • The sub-images are the input image, GT, prediction map, localization result, and pixel-level threshold, respectively: [image: val_curve]

Tip: Training takes ~50 hours on the NWPU dataset with two TITAN RTX GPUs (48 GB memory).

Testing and Submitting

  • Modify some key parameters in test.py:
    • netName.
    • model_path.
  • Run python test.py. The output file (*_*_test.txt) will then be generated, and it can be submitted directly to CrowdBenchmark.

Visualization on the val set

  • Modify some key parameters in test.py:
    • test_list = 'val.txt'
    • netName.
    • model_path.
  • Run python test.py. Then the output file (*_*_val.txt) will be generated.
  • Modify some key parameters in vis4val.py:
    • pred_file.
  • Run python vis4val.py.
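
When several runs have produced output files, picking the right value for pred_file by hand is error-prone. The sketch below is one way to locate the newest file matching the *_*_val.txt pattern mentioned above. The helper latest_prediction_file is hypothetical and not part of the repo, and the directory to search is an assumption, since this README does not state where test.py writes its output.

```python
from pathlib import Path


def latest_prediction_file(directory, split="val"):
    """Return the newest file matching *_*_<split>.txt in directory, or None."""
    candidates = list(Path(directory).glob(f"*_*_{split}.txt"))
    if not candidates:
        return None
    # Most recently modified file wins.
    return max(candidates, key=lambda p: p.stat().st_mtime)


# Example: look in the current directory (an assumption about the output location).
print(latest_prediction_file("."))
```

The returned path can then be copied into pred_file in vis4val.py.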

Performance

The results (F1/Pre./Rec. under the sigma_l) and pre-trained models on the NWPU val set, UCF-QNRF, SHT A, SHT B, FDST, and JHU:

| Method | NWPU val | UCF-QNRF | SHT A |
|---|---|---|---|
| Paper: VGG+FPN [2,3] | 77.0/80.2/74.1 | 68.8/78.2/61.5 | 72.5/72.6/72.5 |
| This Repo's Reproduction: VGG+FPN [2,3] | 77.1/82.5/72.3 | 67.8/75.7/61.5 | 71.6/75.9/67.8 |
| Paper: HRNet [1] | 80.2/84.1/76.6 | 72.0/79.3/65.9 | 73.9/79.8/68.7 |
| This Repo's Reproduction: HRNet [1] | 79.8/83.4/76.5 | 72.0/78.7/66.4 | 76.1/79.1/73.3 |

| Method | SHT B | FDST | JHU |
|---|---|---|---|
| Paper: VGG+FPN [2,3] | 80.2/84.9/76.0 | 93.1/92.7/93.5 | - |
| This Repo's Reproduction: VGG+FPN [2,3] | 81.7/88.5/75.9 | 93.9/94.7/93.1 | 61.8/73.2/53.5 |
| Paper: HRNet [1] | 86.2/90.7/82.1 | 95.5/95.3/95.8 | 62.5/74.0/54.2 |
| This Repo's Reproduction: HRNet [1] | 86.0/91.5/81.0 | 95.7/96.9/94.4 | 64.0/73.3/56.8 |
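
As a quick sanity check on the three numbers reported per dataset, the sketch below recomputes F1 from precision and recall, assuming F1 here is the usual harmonic mean of the two. The function name f1_score is illustrative and not taken from the repo; the input values are the NWPU val precision and recall figures listed above.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (Pre., Rec.) in percent, copied from the NWPU val results above.
nwpu_val = {
    "Paper: VGG+FPN": (80.2, 74.1),
    "Reproduction: VGG+FPN": (82.5, 72.3),
    "Paper: HRNet": (84.1, 76.6),
    "Reproduction: HRNet": (83.4, 76.5),
}

for name, (pre, rec) in nwpu_val.items():
    print(f"{name}: F1 = {f1_score(pre, rec):.1f}")
```

For these four rows the recomputed values agree with the reported F1 figures to one decimal place.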

References

  1. Deep High-Resolution Representation Learning for Visual Recognition, T-PAMI, 2019.
  2. Very Deep Convolutional Networks for Large-scale Image Recognition, arXiv, 2014.
  3. Feature Pyramid Networks for Object Detection, CVPR, 2017.

For the leaderboard on the test set, please visit CrowdBenchmark. Our submissions are IIM (HRNet) and IIM (VGG16).

Video Demo

We test the HRNet model pretrained on the NWPU dataset in a real-world subway scene. Please visit bilibili or YouTube to watch the video demonstration.

Citation

If you find this project useful for your research, please cite:

@article{gao2020learning,
  title={Learning Independent Instance Maps for Crowd Localization},
  author={Gao, Junyu and Han, Tao and Yuan, Yuan and Wang, Qi},
  journal={arXiv preprint arXiv:2012.04164},
  year={2020}
}

Our code borrows a lot from the C^3 Framework, and you may cite:

@article{gao2019c,
  title={C$^3$ Framework: An Open-source PyTorch Code for Crowd Counting},
  author={Gao, Junyu and Lin, Wei and Zhao, Bin and Wang, Dong and Gao, Chenyu and Wen, Jun},
  journal={arXiv preprint arXiv:1907.02724},
  year={2019}
}

If you use the pre-trained models in this repo (HRNet, VGG, and FPN), please cite them.

Owner
tao han