Sign Language Recognition service utilizing a deep learning model with Long Short-Term Memory to perform sign language recognition.

Overview

Sign Language Recognition Service

This is a Sign Language Recognition service utilizing a deep learning model with Long Short-Term Memory to perform sign language recognition. The service was developed as a part of a bachelor project at Aalborg University.

alt text

Requirements

  • Python 3.7
  • OpenPose 1.6.0
  • CUDA 10.0
  • cuDNN 7.5.0
  • Numpy 1.18.5
  • OpenCV 4.5.1.48
  • Flask 1.1.2
  • Tensorflow 2.0.0
  • Pandas 1.1.5
  • Tensorboard
  • Matplotlib
  • Seaborn
  • Scikit-Learn

How to use

Installing OpenPose

  1. Please install OpenPose 1.6.0 for Python by following the official guide. Note that the newest release on the OpenPose github is 1.7.0 - for this service to work, 1.6.0 must be used.

    A few things to note when installing OpenPose:

    • When cloning the OpenPose repository, use the following git command to get version 1.6.0:
      git clone --depth 1 --branch v1.6.0 https://github.com/CMU-Perceptual-Computing-Lab/openpose
      
    • Remember to run the following command on the newly cloned repository:
      git submodule update --init --recursive --remote
      
    • Use Visual Studio Enterprise 2017 to build the required files. Install this first if you do not already have it.
    • Install CUDA 10.0 and cuDNN 7.5.0 for CUDA 10.0 after installing Visual Studio Enterprise 2017.
    • When generating the files using CMake, make sure that the BUILD_PYTHON flag is enabled, and that the Python version is set to 3.7. Also make sure that the detected CUDA version is 10.0.
    • After building with Visual Studio Enterprise 2017, make sure that all necessary files have been generated.
      • There should be a openpose.dll in /x64/Release/
      • There should be a openpose.exp and openpose.lib in /src/openpose/Release/
      • There should be a pyopenpose.cp37-win_amd64.pyd in /python/openpose/Release/
  2. Install requirements from requirements.txt

  3. Change the path in main/openpose/paths.py to the path of your OpenPose installation:

    # Change this path so it points to your OpenPose path relative to this file
    OPEN_POSE_PATH = get_relative_path(__file__, '../../../../openpose')
    
  4. If you get any errors related to OpenPose when running the service, please go back and make sure that all instructions have been followed - be particularly careful to install the correct CUDA/cuDNN versions, make sure that the BUILD_PYTHON flag was enabled and that Python 3.7 was used when generating the files.

When OpenPose is successfully installed, you can either use the existing model trained on our dataset, or you can choose to make your own dataset and train a model on this instead.

alt text

Using the service

A singular endpoint '/recognize' has been created in order to perform recognition, which allows for POST requests to be made. The endpoint expects a sequence of base64 images, which will get converted into a suitable format recognizable by the classifier.

alt text

alt text

Creating a custom dataset

In order to create a custom dataset, you can access the file create_dataset.py and change the following constant:

DATASET_NAME = 'dsl_dataset'

Such that the path in the constant DATASET_DIR points to a folder where the dataset is located. This folder should contain another folder called 'src', which contains folders for all the desired labels in the dataset. Each of these folders should contain videos of the corresponding sign.

Before running the script, the following constants can be tweaked based on the desired settings:

WINDOW_LENGTH = 60
STRIDE = 5
BATCH_SIZE = 512
VAL_SPLIT = 0.2
TEST_SPLIT = 0.1

Finally, the following constant can be changed:

CREATE_RAW_DATA = True

This is because initial feature extraction by OpenPose can be a fairly lengthy process. This allows for the tweaking of the dataset after features have been extracted, by setting this to False. Note that the raw OpenPose data must be created before the actual dataset can be created, so it is necessary to do this at least once.

Training a custom model

In order to train a custom model you can make use of the train_models.py file. Here, the constant DATASET_NAME can be changed to reflect the name of the dataset you wish to use, such that the DATASET_DIR points to the correct folder. Furthermore, you can specify a tensorboard directory:

DATASET_NAME = 'dsl_dataset'
DATASET_DIR = f'.\\main\\algorithm\\datasets\\{DATASET_NAME}'
MODELS_DIR = f'.\\main\\algorithm\\models\\{DATASET_NAME}'
TENSORBOARD_DIR = f'{MODELS_DIR}\\logs'

Before running the script, you can tweak various training settings as well as the hyper parameters of the model by changing the following constants:

MODEL_NAME = "model"
EPOCHS = 25
LAYER_SIZES = [64]
DENSE_LAYERS = [0]
DENSE_ACTIVATION = "relu"
LSTM_LAYERS = [2]
LSTM_ACTIVATION = "tanh"
OUTPUT_ACTIVATION = "softmax"

Note that the trainer can train multiple models depending on these settings. Changing the LAYER_SIZES, DENSE_LAYERS and LSTM_LAYERS to contain several values will result in a model being trained for each possible combination.

After training your model, you should change the paths.py located in main/core/ to reflect the path to the new model by changing the constant MODEL_NAME to the name of your model:

MODEL_NAME = 'dsl_lstm.model'

Finally, it also possible to generate a confusion matrix for your model by using the generate_confusion_matrix.py script. Here, you simply change the constants DATASET_NAME and MODEL_NAME such that the DATASET_DIR points to your dataset directory, and MODEL_DIR points to your model directory, respectively:

DATASET_NAME = "dsl_dataset"
MODEL_NAME = "dsl_lstm"
DATASET_DIR = f"./main/algorithm/datasets/{DATASET_NAME}/{DATASET_NAME}.pickle"
MODEL_DIR = f"./main/algorithm/models/{DATASET_NAME}/{MODEL_NAME}"

Happy signing :O)

Authors

  • Adil Cemalovic
  • Martin Lønne
  • Magnus Helleshøj Lund
Owner
Martin Lønne
Full-stack software developer with an interest in Cloud development. Is working most with Javascript, C#, and Python for machine learning.
Martin Lønne
CRAFT-Pyotorch:Character Region Awareness for Text Detection Reimplementation for Pytorch

CRAFT-Reimplementation Note:If you have any problems, please comment. Or you can join us weChat group. The QR code will update in issues #49 . Reimple

453 Dec 28, 2022
⛓ marc is a small, but flexible Markov chain generator

About marc (markov chain) is a small, but flexible Markov chain generator. Usage marc is easy to use. To build a MarkovChain pass the object a sequenc

Max Humber 65 Oct 27, 2022
Image processing using OpenCv

Image processing using OpenCv Write a program that opens the webcam, and the user selects one of the following on the video: ✅ If the user presses the

M.Najafi 4 Feb 18, 2022
The world's simplest facial recognition api for Python and the command line

Face Recognition You can also read a translated version of this file in Chinese 简体中文版 or in Korean 한국어 or in Japanese 日本語. Recognize and manipulate fa

Adam Geitgey 47k Jan 07, 2023
Basic functions manipulating images using the OpenCV library

OpenCV Basic functions manipulating images using the OpenCV library. Reading Ima

Shatha Siala 3 Feb 17, 2022
CNN+Attention+Seq2Seq

Attention_OCR CNN+Attention+Seq2Seq The model and its tensor transformation are shown in the figure below It is necessary ch_ train and ch_ test the p

Tsukinousag1 2 Jul 14, 2022
This is a Computer vision package that makes its easy to run Image processing and AI functions. At the core it uses OpenCV and Mediapipe libraries.

CVZone This is a Computer vision package that makes its easy to run Image processing and AI functions. At the core it uses OpenCV and Mediapipe librar

CVZone 648 Dec 30, 2022
Tesseract Open Source OCR Engine (main repository)

Tesseract OCR About This package contains an OCR engine - libtesseract and a command line program - tesseract. Tesseract 4 adds a new neural net (LSTM

48.4k Jan 09, 2023
A curated list of papers and resources for scene text detection and recognition

Awesome Scene Text A curated list of papers and resources for scene text detection and recognition The year when a paper was first published, includin

Jan Zdenek 43 Mar 15, 2022
BoxToolBox is a simple python application built around the openCV library

BoxToolBox is a simple python application built around the openCV library. It is not a full featured application to guide you through the w

František Horínek 1 Nov 12, 2021
This is a real life mario project using python and mediapipe

real-life-mario This is a real life mario project using python and mediapipe How to run to run this just run - realMario.py file requirements This req

Programminghut 42 Dec 22, 2022
Repository for playing the computer vision apps: People analytics on Raspberry Pi.

play-with-torch Repository for playing the computer vision apps: People analytics on Raspberry Pi. Tools Tested Hardware RasberryPi 4 Model B here, RA

eMHa 1 Sep 23, 2021
pulse2percept: A Python-based simulation framework for bionic vision

pulse2percept: A Python-based simulation framework for bionic vision Retinal degenerative diseases such as retinitis pigmentosa and macular degenerati

67 Dec 29, 2022
Write-ups for the SwissHackingChallenge2021 CTF.

SwissHackingChallenge 2021 : Write-ups This repository contains a collection of my write-ups for challenges solved during the SwissHackingChallenge (S

Julien Béguin 3 Jun 07, 2021
This repository provides train&test code, dataset, det.&rec. annotation, evaluation script, annotation tool, and ranking.

SCUT-CTW1500 Datasets We have updated annotations for both train and test set. Train: 1000 images [images][annos] Additional point annotation for each

Yuliang Liu 600 Dec 18, 2022
This is the code for our paper DAAIN: Detection of Anomalous and AdversarialInput using Normalizing Flows

Merantix-Labs: DAAIN This is the code for our paper DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows which can be found at

Merantix 14 Oct 12, 2022
Detect the mathematical formula from the given picture and the same formula is extracted and converted into the latex code

Mathematical formulae extractor The goal of this project is to create a learning based system that takes an image of a math formula and returns corres

6 May 22, 2022
Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation, CVPR 2020 (Oral)

SEAM The implementation of Self-supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentaion. You can also download the repos

Hibercraft 459 Dec 26, 2022
Recognizing cropped text in natural images.

ASTER: Attentional Scene Text Recognizer with Flexible Rectification ASTER is an accurate scene text recognizer with flexible rectification mechanism.

Baoguang Shi 681 Jan 02, 2023
An interactive document scanner built in Python using OpenCV

The scanner takes a poorly scanned image, finds the corners of the document, applies the perspective transformation to get a top-down view of the document, sharpens the image, and applies an adaptive

Kushal Shingote 1 Feb 12, 2022