Framework for the Complete Gaze Tracking Pipeline

The figure below shows a general representation of the camera-to-screen gaze tracking pipeline [1]. From left to right, the webcam image is preprocessed to create normalized images of the eyes and face. These images are fed into a model, which predicts the 3D gaze vector. The predicted gaze vector can be projected onto the screen once the user's head pose is known.
This framework enables real-time prediction of the viewing position on the screen based only on the input image.
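
The last step of the pipeline, projecting the gaze vector onto the screen, can be illustrated as a ray-plane intersection. The following is only a minimal sketch, not the code used in this repository: the function name gaze_point_on_screen, the coordinate convention (screen plane at z = 0, units in millimetres), and the example values are all assumptions made for illustration.

```python
def gaze_point_on_screen(eye_center, gaze_vector,
                         plane_point=(0.0, 0.0, 0.0),
                         plane_normal=(0.0, 0.0, 1.0)):
    """Intersect a gaze ray with the screen plane (hypothetical helper).

    eye_center   -- 3D origin of the gaze ray (assumed to come from head pose)
    gaze_vector  -- 3D gaze direction predicted by the model
    plane_point  -- any point on the screen plane (assumed: the origin)
    plane_normal -- normal of the screen plane (assumed: the z axis)
    Returns the 3D intersection point, or None if the ray misses the plane.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    denom = dot(gaze_vector, plane_normal)
    if abs(denom) < 1e-9:
        # The gaze is parallel to the screen plane: no intersection.
        return None

    offset = tuple(p - e for p, e in zip(plane_point, eye_center))
    t = dot(offset, plane_normal) / denom
    if t < 0:
        # The screen plane lies behind the viewer.
        return None

    return tuple(e + t * g for e, g in zip(eye_center, gaze_vector))


# Example: eye 500 mm in front of the screen, looking slightly right and down.
point = gaze_point_on_screen((0.0, 0.0, 500.0), (0.1, -0.2, -1.0))
```

In practice the eye position and the screen plane would come from the head pose estimate and the camera and screen calibration described in the steps below.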

Install the dependencies with pip install -r requirements.txt.
If necessary, calibrate the camera with the provided interactive script (python calibrate_camera.py); see Camera Calibration by OpenCV.
For higher accuracy, it is also advisable to calibrate the position of the screen as described by Takahashi et al., who provide an OpenCV and MATLAB implementation.
To make reliable predictions, the proposed model needs to be calibrated specifically for each user. Software is provided to collect this calibration data.
Train a model or download a pretrained model.
Once all previous steps are complete, run python main.py --calibration_matrix_path=./calibration_matrix.yaml --model_path=./p00.ckpt; a "red laser pointer" should then be visible on the screen. main.py also provides several visualization options:
1. --visualize_preprocessing to visualize the preprocessed images
2. --visualize_laser_pointer to show the point on the screen the person is looking at as a red laser pointer dot, see the right monitor in the image below
3. --visualize_3d to visualize the head, the screen, and the gaze vector in a 3D scene, see the left monitor in the image below
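
To draw a laser pointer dot, a gaze point on the screen plane has to be converted from physical units to pixel coordinates. The sketch below shows one way this could be done; it is not taken from main.py. The function name screen_mm_to_pixels, the assumption that the origin is the top-left corner of the screen with x pointing right and y pointing down, and the screen dimensions in the example are all hypothetical.

```python
def screen_mm_to_pixels(x_mm, y_mm,
                        screen_width_mm, screen_height_mm,
                        screen_width_px, screen_height_px):
    """Convert a point on the screen plane from millimetres to pixels.

    Assumes the origin is the top-left corner of the screen, with x pointing
    right and y pointing down (an illustrative convention only).
    Returns ((px, py), on_screen); the pixel position is clamped to the
    screen so the dot stays visible at the border when the gaze is off-screen.
    """
    px = round(x_mm / screen_width_mm * screen_width_px)
    py = round(y_mm / screen_height_mm * screen_height_px)

    on_screen = 0 <= px < screen_width_px and 0 <= py < screen_height_px

    # Clamp to the valid pixel range.
    px = min(max(px, 0), screen_width_px - 1)
    py = min(max(py, 0), screen_height_px - 1)
    return (px, py), on_screen


# Example with an assumed 500 mm x 300 mm screen at 1920 x 1080 pixels.
dot_position, visible = screen_mm_to_pixels(250.0, 150.0, 500.0, 300.0, 1920, 1080)
```

The physical screen size is not something the webcam can measure, so values like these would have to be supplied by the user or taken from the screen calibration.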

[1] Amogh Gudi, Xin Li, and Jan van Gemert, “Efficiency in real-time webcam gaze tracking”, in Computer Vision - ECCV 2020 Workshops - Glasgow, UK, August 23-28, 2020, Proceedings, Part I, Adrien Bartoli and Andrea Fusiello, Eds., ser. Lecture Notes in Computer Science, vol. 12535, Springer, 2020, pp. 529–543. DOI: 10.1007/978-3-030-66415-2_34. [Online]. Available: https://doi.org/10.1007/978-3-030-66415-2_34.

Owner: Pascal
