End-to-end face detection, cropping, norm estimation, and landmark detection in a single onnx model

Last update: Dec 30, 2022

Overview

onnx-facial-lmk-detector

End-to-end face detection, cropping, norm estimation, and landmark detection in a single ONNX model, model.onnx.

Demo

You can try this model at the following link. Thanks to hysts.

https://huggingface.co/spaces/hysts/atksh-onnx-facial-lmk-detector

Code

See src.

Example

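import cv2
import onnxruntime as ort

sess = ort.InferenceSession("model.onnx")

# cv2.imread returns a BGR uint8 array of shape (H, W, 3), or None if the file cannot be read.
img = cv2.imread("input.jpg")
if img is None:
    raise FileNotFoundError("input.jpg")

scores, bboxes, keypoints, aligned_imgs, landmarks, affine_matrices = sess.run(None, {"input": img})
# Output dtypes and shapes (N = number of detected faces):
#   scores           float32  (N,)
#   bboxes           int64    (N, 4)
#   keypoints        int64    (N, 5, 2)
#   aligned_imgs     uint8    (N, 224, 224, 3)
#   landmarks        int64    (N, 106, 2)
#   affine_matrices  float32  (N, 2, 3)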

This model requires onnxruntime>=1.11.
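The outputs are parallel arrays indexed by face, so a small helper can regroup them per face. The following is a minimal sketch, not part of the repository: the function name `select_faces` and the `min_score` threshold of 0.5 are assumptions chosen for illustration, and the helper makes no assumption about how the four bbox values are laid out.

```python
import numpy as np


def select_faces(scores, bboxes, keypoints, landmarks, min_score=0.5):
    """Group the per-face outputs into dicts, keeping faces with score >= min_score.

    `min_score` is a hypothetical threshold; tune it for your data.
    """
    scores = np.asarray(scores)
    # Indices of faces whose confidence passes the threshold (empty if N == 0).
    keep = np.flatnonzero(scores >= min_score)
    return [
        {
            "score": float(scores[i]),
            "bbox": bboxes[i],            # (4,)
            "keypoints": keypoints[i],    # (5, 2)
            "landmarks": landmarks[i],    # (106, 2)
        }
        for i in keep
    ]
```

With the variables from the example above, `select_faces(scores, bboxes, keypoints, landmarks)` returns one dict per retained face, and an empty list when no face is detected.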

How does it work?

This is simply a merged model of the following underlying models with some pre- and post-processing.

Underlying models

task	model	reference
face detection	SCRFD_10G_KPS	https://github.com/deepinsight/insightface/tree/master/detection/scrfd#pretrained-models
landmark detection	2d106det	https://github.com/deepinsight/insightface/blob/master/alignment/coordinate_reg/README.md#pretrained-models

Pre- and Post-Processing

The following processing was implemented in PyTorch and exported to ONNX.

1. Input transform
   - Resize and pad to (1920, 1920)
   - Convert BGR to RGB
   - Transpose (H, W, C) to (C, H, W)
2. Face detection (SCRFD_10G_KPS)
3. Post-processing of face detection
   - Process the predicted bounding boxes and confidence scores
   - NMS (ONNX operator)
4. Norm estimation and face cropping
   - Estimate the norm and apply an affine transformation to each face
   - Crop the faces and resize them to (192, 192)
5. Landmark detection (2d106det)
6. Post-processing of landmark detection
   - Process the predicted landmarks and apply the inverse affine transform to each face
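To make the first and last steps more concrete, here is a rough NumPy sketch of an input transform and of an inverse affine mapping. It is not the code exported into model.onnx (the model performs all of this internally). The function names, the aspect-preserving resize, the zero padding at the bottom and right, and the nearest-neighbour interpolation are all assumptions made for illustration.

```python
import numpy as np


def letterbox(img_bgr, size=1920):
    """Resize (keeping aspect ratio), pad to (size, size), BGR -> RGB, HWC -> CHW."""
    h, w = img_bgr.shape[:2]
    scale = size / max(h, w)
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))
    # Nearest-neighbour resize by index lookup (assumed stand-in for a real resize op).
    rows = np.minimum((np.arange(new_h) / scale).astype(np.int64), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(np.int64), w - 1)
    resized = img_bgr[rows[:, None], cols[None, :]]
    # Zero padding on the bottom and right (assumed padding placement).
    padded = np.zeros((size, size, 3), dtype=img_bgr.dtype)
    padded[:new_h, :new_w] = resized
    rgb = padded[:, :, ::-1]                                 # BGR -> RGB
    return np.ascontiguousarray(rgb.transpose(2, 0, 1)), scale  # (C, H, W)


def apply_affine(m, pts):
    """Apply a 2x3 affine matrix [A | t] to points of shape (K, 2)."""
    m = np.asarray(m, dtype=np.float64)
    pts = np.asarray(pts, dtype=np.float64)
    return pts @ m[:, :2].T + m[:, 2]


def invert_affine(m):
    """Invert a 2x3 affine matrix: [A | t] -> [A^-1 | -A^-1 t]."""
    m = np.asarray(m, dtype=np.float64)
    a_inv = np.linalg.inv(m[:, :2])
    return np.hstack([a_inv, -a_inv @ m[:, 2:]])
```

If a 2x3 matrix `m` maps image coordinates to crop coordinates, then `apply_affine(invert_affine(m), pts)` maps points predicted on the crop back to the image. Which direction the returned `affine_matrices` encode is not stated here, so check it against src before relying on it.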

Note

Please check with the model providers regarding the license that applies to your use.

This model includes work distributed under the Apache License 2.0.

Owner

atksh
