End-to-end face detection, cropping, norm estimation, and landmark detection in a single ONNX model

Last update: Dec 30, 2022

Overview

onnx-facial-lmk-detector

End-to-end face detection, cropping, norm estimation, and landmark detection in a single ONNX model, model.onnx.

Demo

You can try this model at the following link. Thanks to hysts.

https://huggingface.co/spaces/hysts/atksh-onnx-facial-lmk-detector

Code

See src.

Example

import onnxruntime as ort
import cv2

sess = ort.InferenceSession("model.onnx")
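# cv2.imread returns a BGR uint8 array of shape (H, W, 3); the model converts BGR to RGB internally.
img = cv2.imread("input.jpg")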

scores, bboxes, keypoints, aligned_imgs, landmarks, affine_matrices = sess.run(None, {"input": img})
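# Output dtypes and shapes (N = number of detected faces):
#   scores:          float32, (N,)
#   bboxes:          int64,   (N, 4)
#   keypoints:       int64,   (N, 5, 2)
#   aligned_imgs:    uint8,   (N, 224, 224, 3)
#   landmarks:       int64,   (N, 106, 2)
#   affine_matrices: float32, (N, 2, 3)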

This model requires onnxruntime>=1.11.
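The outputs are parallel arrays indexed by face, so downstream code usually pairs them up per face. The sketch below shows one way to do that while dropping low-confidence detections. The helper name `filter_faces` and the threshold `min_score=0.5` are assumptions made for illustration only; the model may already apply its own score threshold internally. Dummy lists stand in for the arrays returned by `sess.run`.

```python
def filter_faces(scores, bboxes, landmarks, min_score=0.5):
    """Keep only the faces whose detection score is at least min_score.

    Hypothetical helper, not part of the model. It works on any indexable
    sequences, including the NumPy arrays returned by sess.run.
    """
    kept = [i for i, s in enumerate(scores) if float(s) >= min_score]
    return [(float(scores[i]), bboxes[i], landmarks[i]) for i in kept]


# Dummy values laid out like the model outputs for two faces
# (landmarks truncated to two points per face for brevity).
scores = [0.92, 0.31]
bboxes = [[10, 20, 110, 140], [300, 40, 360, 120]]
landmarks = [[[30, 60], [80, 60]], [[310, 60], [350, 60]]]

faces = filter_faces(scores, bboxes, landmarks, min_score=0.5)
for score, bbox, lmk in faces:
    print(score, bbox, len(lmk))
```

With real outputs, the same call would be `filter_faces(scores, bboxes, landmarks)` using the arrays unpacked from `sess.run`.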

How does it work?

This model simply merges the following underlying models, together with some pre- and post-processing, into a single ONNX graph.

Underlying models

task	model	reference
face detection	SCRFD_10G_KPS	https://github.com/deepinsight/insightface/tree/master/detection/scrfd#pretrained-models
landmark detection	2d106det	https://github.com/deepinsight/insightface/blob/master/alignment/coordinate_reg/README.md#pretrained-models

Pre- and Post-Processing

The following processing steps were implemented in PyTorch and exported to ONNX.

Input transform:
- Resize and pad to (1920, 1920)
- BGR to RGB conversion
- Transpose (H, W, C) to (C, H, W)
(Face detection model)
Post-processing of face detection:
- Process the predicted bounding boxes and confidence scores
- NMS (ONNX operator)
Norm estimation and face cropping:
- Estimate the norm and apply an affine transformation to each face
- Crop the faces and resize them to (192, 192)
(Landmark detection model)
Post-processing of landmark detection:
- Process the predicted landmarks and apply the inverse affine transform to each face
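
Two of the steps listed above, NMS and the inverse affine transform, can be sketched in plain Python. This is only an illustration of the techniques, not the code that was exported to ONNX. The function names, the (x1, y1, x2, y2) box layout, and the `iou_threshold=0.4` value are assumptions. Whether each matrix in `affine_matrices` maps original coordinates to crop coordinates or the reverse is also not stated here, so the direction used in the example is an assumption too.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.4):
    """Greedy NMS: visit boxes by descending score and keep a box only if
    it does not overlap an already kept box by more than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def invert_affine(m):
    """Invert a 2x3 affine matrix [[a, b, tx], [c, d, ty]]."""
    (a, b, tx), (c, d, ty) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("affine matrix is not invertible")
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return [[ia, ib, -(ia * tx + ib * ty)],
            [ic, id_, -(ic * tx + id_ * ty)]]


def apply_affine(m, points):
    """Apply a 2x3 affine matrix to a list of (x, y) points."""
    (a, b, tx), (c, d, ty) = m
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


# Two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 100, 100), (5, 5, 105, 105), (200, 200, 300, 300)]
kept = nms(boxes, [0.9, 0.8, 0.7])

# Assumed direction: original image -> crop (scale by 0.5, then shift).
to_crop = [[0.5, 0.0, 10.0], [0.0, 0.5, 20.0]]
in_crop = apply_affine(to_crop, [(100.0, 200.0)])
restored = apply_affine(invert_affine(to_crop), in_crop)
print(kept, in_crop, restored)
```

The round trip through `apply_affine` and `invert_affine` mirrors the last step of the list: landmarks predicted in crop coordinates are mapped back to the original image by the inverse of the transform used for cropping.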

Note

Please check the license terms with the providers of the underlying models before using this model.

This model includes work distributed under the Apache License 2.0.

Owner

atksh
