This tool uses Deep Learning to help you draw and write with your hand and webcam.

Last update: Dec 10, 2022

air-drawing 👆

This tool uses Deep Learning to help you draw and write with your hand and a webcam. A Deep Learning model predicts whether you intend 'pencil up' or 'pencil down'.

Try it online: loicmagne.github.io/air-drawing

Technical Details

The pipeline has two steps: detecting the hand, and predicting the drawing. Both steps use Deep Learning.
Hand pose detection is performed using the MediaPipe toolbox.
The drawing prediction uses only the finger position, not the image. The input is a sequence of 2D points (in practice I use the speed and acceleration of the finger rather than its position, which makes the prediction translation-invariant), and the output is a binary classification: 'pencil up' or 'pencil down'. I used a simple bidirectional LSTM architecture. I made a small dataset myself (~50 samples), which I annotated with the tools provided in python-stuff/data-wrangling/. At first I wanted to make the 'pencil up'/'pencil down' prediction in real time, i.e. while the user is drawing. However, this task was too difficult and gave poor results, which is why I now use a bidirectional LSTM, which uses both past and future points of the sequence. You can find details of the deep learning pipeline in the Jupyter notebook in python-stuff/deep-learning/.
The application runs entirely client-side. I deployed the deep learning model by converting the PyTorch model to .onnx and running it with ONNX Runtime, which is very convenient and supports a wide range of layers.
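
The prediction step described above can be sketched roughly as follows. This is a minimal illustration, not the code from the notebook: the names `motion_features` and `PencilStateClassifier`, the hidden size, the zero-padding of the first frame, and the choice of one prediction per frame are all assumptions made for the example.

```python
import torch
from torch import nn


def motion_features(points: torch.Tensor) -> torch.Tensor:
    """Turn (T, 2) fingertip positions into (T, 4) [vx, vy, ax, ay] features.

    Finite differences remove any constant offset, so the features are
    translation-invariant. The first frame is padded with zeros (assumption).
    """
    velocity = torch.diff(points, dim=0, prepend=points[:1])
    acceleration = torch.diff(velocity, dim=0, prepend=velocity[:1])
    return torch.cat([velocity, acceleration], dim=-1)


class PencilStateClassifier(nn.Module):
    """Hypothetical bidirectional LSTM giving one 'pencil down' logit per frame."""

    def __init__(self, n_features: int = 4, hidden_size: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(
            n_features, hidden_size, batch_first=True, bidirectional=True
        )
        # Forward and backward hidden states are concatenated, hence 2 * hidden_size.
        self.head = nn.Linear(2 * hidden_size, 1)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, T, n_features) -> logits: (batch, T)
        hidden, _ = self.lstm(features)
        return self.head(hidden).squeeze(-1)


torch.manual_seed(0)
points = torch.rand(50, 2)           # stand-in for a recorded fingertip track
features = motion_features(points)   # (50, 4)
model = PencilStateClassifier()      # untrained, for shape illustration only
with torch.no_grad():
    logits = model(features.unsqueeze(0))   # add a batch dimension
pencil_down = torch.sigmoid(logits) > 0.5   # boolean mask, one value per frame
```

Because the backward direction of the LSTM reads the sequence from its end, a model of this kind needs the whole recorded track before it can label any frame, which matches the move away from real-time prediction.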

Going Forward

Overall, the pipeline still struggles and needs improvement. Ideas for improvement include:

A bigger dataset, with more diverse user data.
Processing and smoothing the finger signal, to depend less on camera quality and to improve model generalization.
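
As one possible way to smooth the finger signal, the sketch below applies an exponential moving average to the sequence of fingertip positions. The filter choice, the function name `smooth_points`, and the `alpha` value are assumptions for illustration; the project does not specify a smoothing method.

```python
def smooth_points(points, alpha=0.5):
    """Exponentially smooth a list of (x, y) fingertip positions.

    alpha close to 1 follows the raw signal; alpha close to 0 smooths heavily.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for x, y in points:
        if not smoothed:
            # The first point has no history, so it is kept unchanged.
            smoothed.append((x, y))
        else:
            prev_x, prev_y = smoothed[-1]
            smoothed.append(
                (alpha * x + (1 - alpha) * prev_x,
                 alpha * y + (1 - alpha) * prev_y)
            )
    return smoothed


# A jittery track that alternates between two positions.
noisy = [(0.0, 0.0), (1.0, 1.0), (0.0, 0.0), (1.0, 1.0)]
smoothed = smooth_points(noisy, alpha=0.5)
```

A lower `alpha` damps camera jitter more strongly but also makes the cursor lag behind the finger, so the value would need tuning against real recordings.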

Owner

lmagne
