This tool uses Deep Learning to help you draw and write with your hand and webcam.

Last update: Dec 10, 2022

Overview

air-drawing 👆

This tool uses Deep Learning to help you draw and write with your hand and webcam. A Deep Learning model predicts whether you intend 'pencil up' or 'pencil down' at each moment.

Try it online: loicmagne.github.io/air-drawing

Technical Details

This pipeline is made up of two steps: detecting the hand, and predicting the drawing. Both steps are done using Deep Learning.
Hand pose detection is performed with the MediaPipe toolbox.
The drawing prediction uses only the finger position, not the image. The input is a sequence of 2D points (in practice I use the speed and acceleration of the finger rather than its position, which makes the prediction translation-invariant), and the output is a binary classification: 'pencil up' or 'pencil down'. I used a simple bidirectional LSTM architecture. I built a small dataset myself (~50 samples) and annotated it with the tools provided in python-stuff/data-wrangling/. At first I wanted to make the 'pencil up'/'pencil down' prediction in real time, i.e. while the user is drawing. However, this task proved too difficult and gave poor results, which is why I now use a bidirectional LSTM, which reads the whole sequence in both directions and therefore predicts after the fact rather than live. You can find details of the deep learning pipeline in the Jupyter notebook in python-stuff/deep-learning/.
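To make the translation-invariance point concrete, here is a minimal sketch of how speed and acceleration features could be derived from fingertip positions with finite differences. The function names, the per-frame units, and the sample track are assumptions for illustration; the actual preprocessing lives in the notebook and may differ.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def finite_difference(seq: List[Point]) -> List[Point]:
    """Return the differences between consecutive 2D points."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(seq, seq[1:])]


def motion_features(positions: List[Point]) -> List[Tuple[float, float, float, float]]:
    """Build per-frame (vx, vy, ax, ay) features from fingertip positions.

    Velocity is the first difference and acceleration the second difference,
    both expressed per frame. The first two frames are dropped because they
    have no acceleration estimate.
    """
    velocity = finite_difference(positions)
    acceleration = finite_difference(velocity)
    # velocity[i + 1] and acceleration[i] both end at frame i + 2.
    return [(vx, vy, ax, ay) for (vx, vy), (ax, ay) in zip(velocity[1:], acceleration)]


# Hypothetical fingertip track in normalized image coordinates.
track = [(0.10, 0.20), (0.12, 0.23), (0.16, 0.25), (0.22, 0.24)]
# The same gesture performed elsewhere in the frame.
shifted = [(x + 0.5, y - 0.1) for x, y in track]

features = motion_features(track)
shifted_features = motion_features(shifted)
```

Because a constant offset cancels out in every difference, `features` and `shifted_features` agree up to floating-point rounding, which is the property the model relies on.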
The application runs entirely client-side. I deployed the deep learning model by converting the PyTorch model to .onnx and running it with ONNX Runtime, which is convenient and supports a wide range of layers.
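As a rough illustration of the model and export step described above, the sketch below defines a small bidirectional LSTM classifier in PyTorch and a helper that exports it with `torch.onnx.export`. The class name, layer sizes, four-feature input, and per-frame output are all assumptions; the real architecture is the one in the notebook.

```python
import torch
from torch import nn


class PencilStateClassifier(nn.Module):
    """Hypothetical bidirectional LSTM that scores each frame as pencil up or down."""

    def __init__(self, num_features: int = 4, hidden_size: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(num_features, hidden_size, batch_first=True, bidirectional=True)
        # Forward and backward hidden states are concatenated, hence 2 * hidden_size.
        self.head = nn.Linear(2 * hidden_size, 1)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, frames, num_features) -> logits: (batch, frames)
        outputs, _ = self.lstm(features)
        return self.head(outputs).squeeze(-1)


def export_to_onnx(model: nn.Module, path: str, num_features: int = 4) -> None:
    """Export the model to an .onnx file that ONNX Runtime can load."""
    model.eval()
    example = torch.zeros(1, 16, num_features)
    torch.onnx.export(
        model, (example,), path,
        input_names=["features"], output_names=["logits"],
        # Assumed here: let the number of frames vary between calls.
        dynamic_axes={"features": {1: "frames"}, "logits": {1: "frames"}},
    )


model = PencilStateClassifier()
logits = model(torch.randn(1, 50, 4))
pencil_down = torch.sigmoid(logits) > 0.5  # boolean mask, one entry per frame
```

Calling `export_to_onnx(model, "model.onnx")` (a hypothetical file name) would write the file that the client-side runtime loads.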

Going Forward

Overall, the pipeline still struggles and needs improvement. Ideas for improvement include:

Building a bigger dataset with more diverse user data.
Processing and smoothing the finger signal, to depend less on camera quality and to improve model generalization.
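One simple way the smoothing idea might be tried is an exponential moving average over the fingertip positions. This is only a sketch of one possible filter, not something the project currently does; the function name and the `alpha` parameter are assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def exponential_smoothing(points: List[Point], alpha: float = 0.4) -> List[Point]:
    """Smooth a 2D track with an exponential moving average.

    alpha close to 1 follows the raw signal closely; alpha close to 0
    smooths more strongly but lags further behind the finger.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[Point] = []
    for x, y in points:
        if not smoothed:
            # The first sample has no history, so it is kept as-is.
            smoothed.append((x, y))
        else:
            px, py = smoothed[-1]
            smoothed.append((alpha * x + (1 - alpha) * px, alpha * y + (1 - alpha) * py))
    return smoothed


# Hypothetical jittery track: x alternates between 0 and 1.
noisy = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]
smooth = exponential_smoothing(noisy, alpha=0.5)
```

Heavier smoothing reduces camera jitter but also flattens the speed and acceleration features, so `alpha` would need tuning against the classifier's accuracy.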

Owner

lmagne
