Official implementation of our paper "LLA: Loss-aware Label Assignment for Dense Pedestrian Detection" in PyTorch.

Last update: Dec 06, 2022

Overview

LLA: Loss-aware Label Assignment for Dense Pedestrian Detection

This project provides an implementation of "LLA: Loss-aware Label Assignment for Dense Pedestrian Detection" in PyTorch.

LLA is the first one-stage detector that surpasses two-stage detectors (e.g., Faster R-CNN) on the CrowdHuman dataset: in the results below, LLA.FCOS reaches 47.5 MR against 48.5 for Faster R-CNN. The experiments in the paper were conducted on an internal framework, so we reimplemented them on cvpods and report the details below.

Requirements

cvpods

Get Started

Install cvpods locally (requires CUDA to compile):

python3 -m pip install 'git+https://github.com/Megvii-BaseDetection/cvpods.git'
# (add --user if you don't have permission)

# Or, to install it from a local clone:
git clone https://github.com/Megvii-BaseDetection/cvpods.git
python3 -m pip install -e cvpods

# Or, from inside a local cvpods clone:
pip install -r requirements.txt
python3 setup.py build develop
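After installation, it can help to confirm that the command-line tools used later in this README are on the PATH. The following is a minimal sketch; `check_tools` is a hypothetical helper written for this example and is not part of cvpods.

```shell
# Hypothetical helper (not part of cvpods): report which of the given
# commands can be found on PATH. Returns non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage (pods_train and pods_test are the commands used in "Train & Test"):
#   check_tools pods_train pods_test
```

If either command is reported missing, the cvpods installation step most likely did not complete.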

Prepare datasets:

cd /path/to/cvpods/datasets
ln -s /path/to/your/crowdhuman/dataset crowdhuman
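A dangling symlink is easy to create if the source path is mistyped. The sketch below wraps the two commands above in a hypothetical helper, `link_dataset` (not part of cvpods), that refuses to link a missing directory and checks that the resulting link resolves. It makes no assumption about the files inside the dataset directory.

```shell
# Hypothetical helper (not part of cvpods): link a dataset directory as
# "crowdhuman" inside the cvpods datasets directory and verify the link.
link_dataset() {
    src="$1"        # e.g. /path/to/your/crowdhuman/dataset
    datasets="$2"   # e.g. /path/to/cvpods/datasets
    if [ ! -d "$src" ]; then
        echo "dataset directory not found: $src" >&2
        return 1
    fi
    # -f replaces an existing link; -n avoids descending into an old link
    ln -sfn "$src" "$datasets/crowdhuman" || return 1
    # -d follows the symlink, so this check fails on a dangling link
    [ -d "$datasets/crowdhuman" ] && echo "linked: $datasets/crowdhuman -> $src"
}

# Usage:
#   link_dataset /path/to/your/crowdhuman/dataset /path/to/cvpods/datasets
```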

Train & Test

git clone https://github.com/Megvii-BaseDetection/LLA.git
cd LLA/playground/detection/crowdhuman/lla.res50.fpn.crowdhuman.800size.30k  # for example

# Train
pods_train --num-gpus 8

# Test (the MODEL.WEIGHTS and OUTPUT_DIR overrides are optional)
pods_test --num-gpus 8 \
    MODEL.WEIGHTS /path/to/your/save_dir/ckpt.pth \
    OUTPUT_DIR /path/to/your/save_dir

# Multi-node training
## To look up MASTER_IP, run ifconfig (provided by net-tools: sudo apt install net-tools)
pods_train --num-gpus 8 --num-machines N --machine-rank 0/1/.../N-1 --dist-url "tcp://MASTER_IP:port"
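In the multi-node command above, every machine runs the same line with a different `--machine-rank`. As a sketch, the hypothetical helper below (`print_launch_cmds`, not part of cvpods) prints one command per machine so each can be copied to the right node. It uses only the flags shown above; the IP address and port in the usage example are placeholders.

```shell
# Hypothetical helper (not part of cvpods): print the pods_train command
# for each machine rank, from 0 to N-1.
print_launch_cmds() {
    num_machines="$1"   # N, the total number of machines
    master_ip="$2"      # IP address of the rank-0 machine
    port="$3"           # a free TCP port on the rank-0 machine
    gpus="${4:-8}"      # GPUs per machine, defaults to 8 as above
    rank=0
    while [ "$rank" -lt "$num_machines" ]; do
        echo "pods_train --num-gpus $gpus --num-machines $num_machines --machine-rank $rank --dist-url \"tcp://$master_ip:$port\""
        rank=$((rank + 1))
    done
}

# Usage (placeholder address and port):
#   print_launch_cmds 2 192.168.1.10 29500
# prints:
#   pods_train --num-gpus 8 --num-machines 2 --machine-rank 0 --dist-url "tcp://192.168.1.10:29500"
#   pods_train --num-gpus 8 --num-machines 2 --machine-rank 1 --dist-url "tcp://192.168.1.10:29500"
```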

Results on CrowdHuman val set

Model	Backbone	LR Sched.	Aux. Branch	NMS Thr.	MR	AP50	Recall	Download
FCOS	Res50	30k	CenterNess	0.6	54.4	86.0	94.1	weights
ATSS	Res50	30k	CenterNess	0.6	49.4	87.3	94.1	weights
Faster R-CNN	Res50	30k	-	0.5	48.5	84.3	87.1	weights
LLA.FCOS	Res50	30k	IoU	0.6	47.5	88.2	94.4	weights

Acknowledgement

This repo is developed based on cvpods. Please check cvpods for more details and features.

License

This repo is released under the Apache 2.0 license. Please see the LICENSE file for more information.

Citing

If you use this work in your research or wish to refer to the baseline results published here, please use the following BibTeX entry:

@article{ge2021lla,
  title={LLA: Loss-aware Label Assignment for Dense Pedestrian Detection},
  author={Ge, Zheng and Wang, Jianfeng and Huang, Xin and Liu, Songtao and Yoshie, Osamu},
  journal={arXiv preprint arXiv:2101.04307},
  year={2021}
}
