WHENet: Real-time Fine-Grained Estimation for Wide Range Head Pose

Yijun Zhou and James Gregson - BMVC2020

Abstract: We present an end-to-end head-pose estimation network designed to predict Euler angles through the full range of head yaws from a single RGB image. Existing methods perform well for frontal views but few target head pose from all viewpoints. This has applications in autonomous driving and retail. Our network builds on multi-loss approaches with changes to loss functions and training strategies adapted to wide range estimation. Additionally, we extract ground truth labelings of anterior views from a current panoptic dataset for the first time. The resulting Wide Headpose Estimation Network (WHENet) is the first fine-grained modern method applicable to the full range of head yaws (hence wide) yet also meets or beats state-of-the-art methods for frontal head pose estimation. Our network is compact and efficient for mobile devices and applications.

Demo

This repo provides two use cases for WHENet: image input and video input. Before running the demo code, install all the requirements with pip install -r requirements.txt. Additionally, download the YOLOv3 model for head detection and put it under yolo_v3/data.

Image demo

To run WHENet with image input, put the images and bbox.txt in one folder (e.g. Sample/) and run python demo.py.

The format of bbox.txt is shown below:

image_name,x_min y_min x_max y_max
mov_001_007585.jpeg,240 0 304 83
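
As a rough illustration of the format above, the following sketch reads bbox.txt-style lines into a dictionary of bounding boxes per image. It is not taken from demo.py: the function name parse_bbox_lines is hypothetical, and skipping a leading image_name header line is an assumption, since the README does not say whether that line appears in the file itself.

```python
from typing import Dict, Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def parse_bbox_lines(lines: Iterable[str]) -> Dict[str, List[Box]]:
    """Parse 'image_name,x_min y_min x_max y_max' lines (hypothetical helper)."""
    boxes: Dict[str, List[Box]] = {}
    for raw in lines:
        line = raw.strip()
        # Assumption: blank lines and an optional header line are skipped.
        if not line or line.startswith("image_name,"):
            continue
        # The image name and the coordinates are separated by the first comma.
        name, coords = line.split(",", 1)
        # The four coordinates are separated by whitespace.
        x_min, y_min, x_max, y_max = (int(v) for v in coords.split())
        boxes.setdefault(name, []).append((x_min, y_min, x_max, y_max))
    return boxes


sample = [
    "image_name,x_min y_min x_max y_max",
    "mov_001_007585.jpeg,240 0 304 83",
]
parsed = parse_bbox_lines(sample)
```

To read a real file, the same function can be called with an open file handle, for example parse_bbox_lines(open("Sample/bbox.txt")). An image may appear on several lines if it contains several heads, so each name maps to a list of boxes.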

Video/Webcam demo

We use YOLOv3 in the video demo to detect heads and get the cropped head images. To customize some of its functions, we have put the YOLO implementation and the pre-trained model in the repo. The head detection YOLO model was trained on Hollywood head and Crowdhuman.

demo_video.py [--video INPUT_VIDEO_PATH] [--snapshot WHENET_MODEL] [--display DISPLAY_OPTION] 
              [--score YOLO_CONFIDENCE_THRESHOLD] [--iou IOU_THRESHOLD] [--gpu GPU#] [--output OUTPUT_VIDEO_PATH]

Please set --video '' for webcam input.
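
The options above could be declared with argparse roughly as follows. This is a minimal sketch, not the actual demo_video.py: the flag names come from the usage line, but the types, the None defaults, and both function names are assumptions. Mapping an empty --video value to device index 0 is likewise an assumption, based on the OpenCV convention that VideoCapture accepts either a device index or a file path.

```python
import argparse
from typing import Union


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags from the usage line (types and defaults are assumed)."""
    parser = argparse.ArgumentParser(prog="demo_video.py")
    parser.add_argument("--video", type=str, default="", help="input video path; '' for webcam")
    parser.add_argument("--snapshot", type=str, default=None, help="WHENet model path")
    parser.add_argument("--display", type=str, default=None, help="display option")
    parser.add_argument("--score", type=float, default=None, help="YOLO confidence threshold")
    parser.add_argument("--iou", type=float, default=None, help="IoU threshold")
    parser.add_argument("--gpu", type=str, default=None, help="GPU number")
    parser.add_argument("--output", type=str, default=None, help="output video path")
    return parser


def resolve_video_source(video: str) -> Union[int, str]:
    """Return a webcam device index for '' (assumed to be 0), else the file path."""
    return 0 if video == "" else video


# Example: an empty --video value selects the webcam.
args = build_parser().parse_args(["--video", "", "--score", "0.5"])
source = resolve_video_source(args.video)
```

The value in source could then be passed to cv2.VideoCapture, which is how a webcam and a video file are usually handled through one code path.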

Dependencies

EfficientNet https://github.com/qubvel/efficientnet
Yolo_v3 https://github.com/qqwweee/keras-yolo3
