Gesture-Detection-and-Depth-Estimation

This is my graduation project.

(1) In this project, I use the YOLOv3 object detection model to detect gestures in RGB images. I trained the model on a self-made gesture dataset to obtain a deep-learning-based gesture detection model. Testing on the test dataset showed that the model meets the requirements of real-time gesture detection while maintaining high accuracy.

(2) I then tried monocular depth estimation based on deep learning to estimate the depth of the gesture object from a single RGB image, using two approaches: the FastDepth algorithm and an improved detection model based on YOLOv3. FastDepth is trained and tested on a self-made gesture-depth dataset. For the second approach, target depth estimation is added to the YOLOv3 model by appending a depth vector to the output dimensions and modifying the loss function; the modified YOLOv3 model is then trained and tested on the same gesture-depth dataset. The experimental results show that both methods can estimate the depth of the gesture object in an RGB image to a certain extent.
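As a rough illustration of the YOLOv3 modification described above, here is a minimal pure-Python sketch. It is not the code used in this project: the output layout (box, objectness, class scores, then one depth value), the squared-error depth term, and the names `split_output`, `depth_loss`, `total_loss`, and `depth_weight` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Prediction:
    box: Tuple[float, ...]           # (x, y, w, h)
    objectness: float
    class_scores: Tuple[float, ...]
    depth: float                     # extra value appended to the output


def split_output(vec: Sequence[float], num_classes: int) -> Prediction:
    """Split one per-anchor output vector that carries an extra depth slot.

    Assumed layout (hypothetical): 4 box values, 1 objectness score,
    num_classes class scores, then 1 depth value.
    """
    expected = 4 + 1 + num_classes + 1
    if len(vec) != expected:
        raise ValueError(f"expected {expected} values, got {len(vec)}")
    return Prediction(
        box=tuple(vec[:4]),
        objectness=vec[4],
        class_scores=tuple(vec[5:5 + num_classes]),
        depth=vec[-1],
    )


def depth_loss(pred: Sequence[float], target: Sequence[float],
               has_object: Sequence[bool]) -> float:
    """Mean squared depth error over positions that contain an object.

    Squared error is an assumption; the original loss may differ.
    """
    terms = [(p - t) ** 2 for p, t, m in zip(pred, target, has_object) if m]
    return sum(terms) / len(terms) if terms else 0.0


def total_loss(detection_loss: float, pred: Sequence[float],
               target: Sequence[float], has_object: Sequence[bool],
               depth_weight: float = 1.0) -> float:
    """Add a weighted depth term to an existing detection loss value."""
    return detection_loss + depth_weight * depth_loss(pred, target, has_object)
```

Masking the depth term with `has_object` reflects the idea that depth is only supervised where a gesture is actually present; positions without a target contribute nothing to the depth error.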

Gesture detection:

Depth data:

Estimated target depth:

(3) I also developed a simple PyOpenGL program that uses gesture information to draw simple shapes in three-dimensional space.
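One way a detected gesture could be turned into a 3D drawing point is sketched below. This is only an assumption about how such a mapping might work, not the method used in this project: it takes the center of a detection box and an estimated depth, and back-projects them with a pinhole camera model. The intrinsics `fx`, `fy`, `cx`, `cy` and the function names are hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def box_center(box: Box) -> Tuple[float, float]:
    """Return the pixel coordinates of the center of a bounding box."""
    x_min, y_min, x_max, y_max = box
    return (x_min + x_max) / 2.0, (y_min + y_max) / 2.0


def pixel_to_camera(u: float, v: float, depth: float,
                    fx: float, fy: float,
                    cx: float, cy: float) -> Tuple[float, float, float]:
    """Back-project a pixel with known depth into camera coordinates.

    Uses the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    The intrinsics are placeholders and must come from a real calibration.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return x, y, depth


def gesture_to_point(box: Box, depth: float,
                     fx: float = 500.0, fy: float = 500.0,
                     cx: float = 320.0, cy: float = 240.0
                     ) -> Tuple[float, float, float]:
    """Map one detection (box plus estimated depth) to a 3D point."""
    u, v = box_center(box)
    return pixel_to_camera(u, v, depth, fx, fy, cx, cy)
```

A sequence of such points, one per frame, could then be passed to the drawing code as the vertices of a stroke or the corners of a shape.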

Drawing a cube:

For more information, see my final paper.

The YOLOv3 model is based on coldlarry's implementation: https://github.com/coldlarry/YOLOv3-complete-pruning

Owner: ChaosAT