基于openpose和图像分类的手语识别项目

Overview

手语识别


0、使用到的模型

(1). openpose,作者:CMU-Perceptual-Computing-Lab

https://github.com/CMU-Perceptual-Computing-Lab/openpose

(2). 图像分类classification,作者:Bubbliiiing

https://github.com/bubbliiiing/classification-pytorch

B站对应视频:https://www.bilibili.com/video/BV143411B7wg

(3). 手语教学视频,作者:二碳碳

https://www.bilibili.com/video/BV1XE41137LV

(感谢大佬们的开源项目和教程,都已star加三连)




1、大致思路

方法一: 将视频输入到openpose中,检测出关节点的变化轨迹,将轨迹绘制在一张图片上,把这张图片传到图像分类网络中检测属于哪个动作

视频  ----->  |  openpose  |-----> 关节点运动轨迹图-------> |  图像分类模型  | ----------> 单词分类  

方法二: 将视频输入到openpose中,检测出每一帧中关节点的位置,将多帧进行堆叠,形成一个三维张量,其中两个维度是图片的宽和高,一个维度是时间,然后对这个三维张量使用三维卷积进行训练和预测

视频  ----->  |  openpose  |-----> 多张关节点位置图 ---------> |  堆叠  | --------> 三维张量 -------> |  三维卷积网络  | ----------> 单词分类  



2、环境配置

python:3.7(其他版本会导致openpose无法运行,建议使用anaconda的python环境)
cuda:10
cudnn:7或8应该都行
(配置cuda和cudnn会比较麻烦,如果实在不想配,你可以去openpose的github网站下载使用cpu的版本,这里这个版本应该不支持cpu)


具体的配置环境方式:

(0).python和cuda和cudnn自己装


(1).下载文件

下载代码文件后,再从网盘下载模型和数据文件(没有这些跑不起来),网盘链接:

链接:https://pan.baidu.com/s/1Q2aVVhMhSfWL4qKS9QslkQ 
提取码:abcd 

将从github下载的文件夹和网盘下载的文件夹合并,然后就可以下一步了。 (当然你大可直接找我要u盘拿完整的文件)


(2).安装requirements.txt中的库

cmd进入环境后,cd到项目文件夹下,执行指令:

pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

(3).安装torch和torchvision

先下载好torch(1.2.0)和torchvision(0.4.0)的whl文件,下载地址:

链接:https://pan.baidu.com/s/1QIuJfEE5qQFpXY8ZlHeLNQ 
提取码:abcd 

(当然你依旧可直接找我拿u盘)

下载好torch和torchvision的whl文件后,cmd进入环境,cd到下载文件夹下,执行指令:

pip install [torch或torchvision的whl文件的文件名]

(先装torch再装torchvision,不然有可能会报错)



3、测试运行openpose

项目文件夹下有三个文件:

test.py
test_video.py
test_video_track_point.py

分别对应openpose的功能:检测图片、检测视频、检测视频并绘制关节点轨迹
具体的使用方法可以看文件中的注释部分


在test_video_track_point.py中,取消掉最后几行的注释,就可以将绘制的轨迹图送到classification中去做分类检测
(不过现阶段分类器尚未做好)




4、classification的训练和使用

可以看下classification文件夹中的README.md文件,大佬已经在里边讲得很详细了

Read Japanese manga inside browser with selectable text.

mokuro Read Japanese manga with selectable text inside a browser. See demo: https://kha-white.github.io/manga-demo mokuro_demo.mp4 Demo contains excer

Maciej Budyś 170 Dec 27, 2022
This is a passport scanning web service to help you scan, identify and validate your passport created with a simple and flexible design and ready to be integrated right into your system!

Passport-Recogniton-System This is a passport scanning web service to help you scan, identify and validate your passport created with a simple and fle

Mo'men Ashraf Muhamed 7 Jan 04, 2023
A facial recognition device is a device that takes an image or a video of a human face and compares it to another image faces in a database.

A facial recognition device is a device that takes an image or a video of a human face and compares it to another image faces in a database. The structure, shape and proportions of the faces are comp

Pavankumar Khot 4 Mar 19, 2022
Detecting Text in Natural Image with Connectionist Text Proposal Network (ECCV'16)

Detecting Text in Natural Image with Connectionist Text Proposal Network The codes are used for implementing CTPN for scene text detection, described

Tian Zhi 1.3k Dec 22, 2022
CRAFT-Pyotorch:Character Region Awareness for Text Detection Reimplementation for Pytorch

CRAFT-Reimplementation Note:If you have any problems, please comment. Or you can join us weChat group. The QR code will update in issues #49 . Reimple

453 Dec 28, 2022
RRD: Rotation-Sensitive Regression for Oriented Scene Text Detection

RRD: Rotation-Sensitive Regression for Oriented Scene Text Detection For more details, please refer to our paper. Citing Please cite the related works

Minghui Liao 102 Jun 29, 2022
Python bindings for JIGSAW: a Delaunay-based unstructured mesh generator.

JIGSAW: An unstructured mesh generator JIGSAW is an unstructured mesh generator and tessellation library; designed to generate high-quality triangulat

Darren Engwirda 26 Dec 13, 2022
Deep LearningImage Captcha 2

滑动验证码深度学习识别 本项目使用深度学习 YOLOV3 模型来识别滑动验证码缺口,基于 https://github.com/eriklindernoren/PyTorch-YOLOv3 修改。 只需要几百张缺口标注图片即可训练出精度高的识别模型,识别效果样例: 克隆项目 运行命令: git cl

Python3WebSpider 117 Dec 28, 2022
EQFace: An implementation of EQFace: A Simple Explicit Quality Network for Face Recognition

EQFace: A Simple Explicit Quality Network for Face Recognition The first face recognition network that generates explicit face quality online.

DeepCam Shenzhen 141 Dec 31, 2022
Balabobapy - Using artificial intelligence algorithms to continue the text

Balabobapy - Using artificial intelligence algorithms to continue the text

qxtony 1 Feb 04, 2022
Using python libraries to track hands

Python-HandTracking Using python libraries to track hands on a camera Uses cv2 and mediapipe libraries custom hand tracking module PyCharm IDE Final E

Martin Matsudaira 1 Dec 17, 2021
Contextual speed detection for python

Speed Prediction using Optical Flow and 2D CNN About the challenge: Comma.AI Speed Challenge This challenge was developed by Comma.AI to predict the s

Mahimana Bhatt 2 Dec 16, 2021
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

English | 简体中文 Introduction PaddleOCR aims to create multilingual, awesome, leading, and practical OCR tools that help users train better models and a

27.5k Jan 08, 2023
LEARN OPENCV IN 3 HOURS USING PYTHON - INCLUDING EXAMPLE PROJECTS

LEARN OPENCV IN 3 HOURS USING PYTHON - INCLUDING EXAMPLE PROJECTS

Murtaza Hassan 815 Dec 29, 2022
Document blur detection based on Laplacian operator and text detection.

Document Blur Detection For general blurred image, using the variance of Laplacian operator is a good solution. But as for the blur detection of docum

JoeyLr 5 Oct 20, 2022
Course material for the Multi-agents and computer graphics course

TC2008B Course material for the Multi-agents and computer graphics course. Setup instructions Strongly recommend using a custom conda environment. Ins

16 Dec 13, 2022
A python programusing Tkinter graphics library to randomize questions and answers contained in text files

RaffleOfQuestions Um programa simples em python, utilizando a biblioteca gráfica Tkinter para randomizar perguntas e respostas contidas em arquivos de

Gabriel Ferreira Rodrigues 1 Dec 16, 2021
TedEval: A Fair Evaluation Metric for Scene Text Detectors

TedEval: A Fair Evaluation Metric for Scene Text Detectors Official Python 3 implementation of TedEval | paper | slides Chae Young Lee, Youngmin Baek,

Clova AI Research 167 Nov 20, 2022
Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless. This is the official Roboflow python package that interfaces with the Roboflow API.

Roboflow 52 Dec 23, 2022
ISI's Optical Character Recognition (OCR) software for machine-print and handwriting data

VistaOCR ISI's Optical Character Recognition (OCR) software for machine-print and handwriting data Publications "How to Efficiently Increase Resolutio

ISI Center for Vision, Image, Speech, and Text Analytics 21 Dec 08, 2021